Computer Architectures


Computer Architectures: Processor + Memory
Richard Šusta, Pavel Píša
Czech Technical University in Prague, Faculty of Electrical Engineering
AE0B36APO Computer Architectures, Ver. 1.00, 2019

The main instruction cycle of the CPU

1. Initial setup/reset – set the initial PC value, PSW, etc.
2. Read the instruction from memory:
   • PC → address bus
   • read the memory contents (the machine instruction) and transfer it to the IR
   • PC + l → PC, where l is the length of the instruction
3. Decode the operation code (opcode).
4. Execute the operation – compute the effective address, select registers, read operands, pass them through the ALU and store the result.
5. Check for exceptions/interrupts (and service them).
6. Repeat from step 2.

(A small C sketch of this loop appears after the next two subsections.)

Single cycle CPU – implementation of the load instruction

lw: type I; rs – base address, imm – offset, rt – register where to store the fetched data

I-type format: opcode (bits 31:26), rs (25:21), rt (20:16), immediate (15:0)

Control signals: RegWrite = 1; the register file is written on the rising edge of CLK.

[Datapath figure: PC addresses the instruction memory; bits 25:21 (rs) select register-file read port A1; the 16-bit immediate is sign-extended (Sign Ext) to SignImm; the ALU adds RD1 + SignImm (ALUControl) to form the data-memory address; ReadData is written back through write port A3/WD3 to register rt (bits 20:16); an adder computes PCPlus4 = PC + 4 as the next PC.]

Single cycle CPU – implementation of the store instruction

sw: type I; rs – base address, imm – offset, rt – selects the register to store into memory

I-type format: opcode (bits 31:26), rs (25:21), rt (20:16), immediate (15:0)

Control signals: RegWrite = 0, MemWrite = 1.

[Datapath figure: the same address path as for lw, but the value read from register rt (bits 20:16) drives the data-memory write port WD, and nothing is written back to the register file.]
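As promised above, the main instruction cycle maps directly onto a simulator loop. The following C sketch covers steps 1–6 under simplifying assumptions (a word-addressed unified toy memory, fixed 4-byte instructions); the memory size and the decode_and_execute helper are illustrative, not part of the lecture.

```c
#include <stdint.h>
#include <stdbool.h>

#define MEM_WORDS 1024

static uint32_t mem[MEM_WORDS];  /* toy unified instruction/data memory */
static uint32_t pc;              /* program counter */

/* Hypothetical helper covering steps 3-4 (decode opcode, execute);
 * returns false when the simulated program should stop. */
bool decode_and_execute(uint32_t ir);

void cpu_run(uint32_t reset_pc)
{
    pc = reset_pc;                       /* 1. initial setup/reset          */
    for (;;) {                           /* 6. repeat from step 2           */
        uint32_t ir = mem[pc / 4];       /* 2. PC -> address bus, fetch IR  */
        pc += 4;                         /*    PC + l -> PC, l = 4 on MIPS  */
        if (!decode_and_execute(ir))     /* 3.-4. decode and execute        */
            break;                       /* 5. exception/interrupt handling */
    }                                    /*    omitted in this sketch       */
}
```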
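The I-type fields used by lw and sw, and the Sign Ext block, can be reproduced bit-for-bit in C. A minimal sketch; the example instruction word is my own (it encodes lw $2, 8($2)), not taken from the slides.

```c
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    uint32_t instr  = 0x8C420008;            /* lw $2, 8($2), for example */
    uint32_t opcode = (instr >> 26) & 0x3F;  /* bits 31:26 */
    uint32_t rs     = (instr >> 21) & 0x1F;  /* bits 25:21 */
    uint32_t rt     = (instr >> 16) & 0x1F;  /* bits 20:16 */
    uint32_t imm    = instr & 0xFFFF;        /* bits 15:0  */

    /* Sign Ext block: replicate bit 15 into bits 31:16. */
    int32_t sign_imm = (int32_t)(int16_t)imm;

    printf("opcode=%u rs=%u rt=%u SignImm=%d\n", opcode, rs, rt, sign_imm);
    return 0;
}
```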
Switch analogy

A 2-to-1, 1-bit multiplexer behaves like a selector switch: the Select (address) input determines which data input drives the output z – Select = 0 → z = x0, Select = 1 → z = x1.

[Figure: two-position switch analogy and the 2-to-1 multiplexer symbol.]

Single cycle CPU – implementation of the add instruction

add: type R; rs, rt – sources, rd – destination, funct – selects the ALU operation (= add)

R-type format: opcode (bits 31:26), rs (25:21), rt (20:16), rd (15:11), shamt (10:6), funct (5:0)

Control signals: RegWrite = 1, ALUSrc = 0, MemToReg = 0, RegDst = 1

[Datapath figure: with ALUSrc = 0 a multiplexer feeds RD2 (not SignImm) to ALU input SrcB; with MemToReg = 0 the ALU result bypasses the data memory on its way back to the register file; with RegDst = 1 the write register WriteReg is rd (bits 15:11) instead of rt (bits 20:16).]

Single cycle CPU – sub, and, or, slt

The only difference is another ALU operation selection (ALUControl). The data path is the same as for the add instruction.

Control signals: RegWrite = 1, ALUSrc = 0, MemToReg = 0, RegDst = 1

Single cycle CPU – implementation of beq

• beq – branch if equal; imm – offset; PC′ = PC + 4 + SignImm * 4 (verified in the C sketch below)

I-type format: opcode (bits 31:26), rs (25:21), rt (20:16), immediate (15:0)

Control signals: RegWrite = 0, ALUSrc = 0, Branch = 1, MemToReg = x, RegDst = x

[Datapath figure: the ALU subtracts the two source registers; its Zero output, gated by Branch, selects PCBranch = PCPlus4 + (SignImm << 2) instead of PCPlus4 as the next PC.]

Single cycle CPU – throughput: IPS = IC / T = IPC · fCLK

• What is the maximal possible frequency of the CPU?
• It is given by the latency of the critical path – in our case the lw instruction:
  Tc = tPC + tMem + tRFread + tALU + tMem + tMux + tRFsetup
• Tc = Tc,instr + Tc,proc = (tPC + tMem) + (tRFread + tALU + tMem + tMux + tRFsetup)
• Consider the following parameters:
  tPC = 30 ns, tMem = 300 ns, tRFread = 50 ns, tALU = 200 ns, tMux = 20 ns, tRFsetup = 20 ns
• Then Tc = 920 ns → fCLK max = 1.08 MHz, IPS = 1 080 000 instructions per second.
• If instruction fetch (Tc,instr) is executed in parallel with processing (Tc,proc), then Tc,instr < Tc,proc and the cycle time is Tc,proc = 50 + 200 + 300 + 20 + 20 = 590 ns → fCLK = 1.69 MHz, IPS = 1 690 000.

Note

Remember this result, so you can compare it with the result for the pipelined CPU.
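For the beq slide above: the next-PC computation PC′ = PC + 4 + SignImm · 4, written out in C. A sketch with made-up values for the PC and the 16-bit word offset.

```c
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    uint32_t pc  = 0x00400010;               /* address of the beq itself   */
    int16_t  imm = -3;                       /* 16-bit offset, in words     */

    uint32_t pc_plus4  = pc + 4;             /* PCPlus4 adder               */
    int32_t  sign_imm  = (int32_t)imm;       /* Sign Ext block              */
    uint32_t pc_branch = pc_plus4 + ((uint32_t)sign_imm << 2); /* << 2 = *4 */

    printf("PCBranch = 0x%08X\n", pc_branch); /* 0x00400014 - 12 = 0x00400008 */
    return 0;
}
```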
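And the throughput numbers: this sketch merely redoes the slide's critical-path arithmetic, so the two printed frequencies should match the 1.08 MHz and 1.69 MHz derived above (up to rounding).

```c
#include <stdio.h>

int main(void)
{
    /* Component latencies from the slide, in nanoseconds. */
    double t_pc = 30, t_mem = 300, t_rf_read = 50,
           t_alu = 200, t_mux = 20, t_rf_setup = 20;

    /* Critical path of lw: instruction fetch + processing. */
    double tc = (t_pc + t_mem)
              + (t_rf_read + t_alu + t_mem + t_mux + t_rf_setup);
    printf("Tc = %.0f ns, fCLK max = %.2f MHz\n", tc, 1e3 / tc);

    /* Fetch overlapped with processing: the longer phase sets the cycle. */
    double tc_proc = t_rf_read + t_alu + t_mem + t_mux + t_rf_setup;
    printf("Tc,proc = %.0f ns, fCLK = %.2f MHz\n", tc_proc, 1e3 / tc_proc);
    return 0;
}
```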
Key technology gaps prediction

Note: the increase of algorithm complexity over time has been formalized in the literature as the so-called Shannon's law.

Memory and CPU speed – Moore's law in more exact numbers

• CPU performance grew by roughly 25 % per year, then 52 % per year, then 20 % per year (the three eras in the Hennessy–Patterson data).
• The processor–memory performance gap keeps growing: memory throughput improves by only about 7 % per year.
Source: Hennessy, Patterson: Computer Architecture: A Quantitative Approach, 4th ed., 2006

Layout of a program in memory

From lower to higher addresses:
• other programs
• text segment – the program code; the entry point (.ent start) provides the initial value of PC
• data segment – static initialized data, followed by static uninitialized data
• dynamic area – heap (cz: halda), grows toward higher addresses
• stack segment (cz: zásobník) – grows toward lower addresses

Memory addressing

The memory address in load and store instructions is specified by a base register and an offset; this is called base addressing.

    addi $a0, $zero, 10
    # load the word from an absolute address (register $0 is always 0)
    lw $2, 0x2000($0)
    # store the word to an absolute address
    sw $2, 0x2004($0)

Data types in instructions

MIPS registers hold 32-bit (4-byte) words; other common data sizes are the byte and the halfword. (A C sketch of the load extensions appears at the end of this section.)
• byte = 8 bits – example: lb / lbu – load byte, sign-extended / zero-extended (u = unsigned)
• halfword = 2 bytes – example: lh / lhu – load halfword, sign-extended / zero-extended
• word = 4 bytes – example: lw – load word

Data directives

• .WORD – stores the list as 32-bit values aligned on a word boundary
• .WORD w:n – stores the 32-bit value w into n consecutive words aligned on a word boundary
• .HALF – stores the list as 16-bit values aligned on a half-word boundary
• .HALF w:n – stores the 16-bit value w into n consecutive half-words aligned on a half-word boundary
• .BYTE – stores the list of values as 8-bit bytes
• .BYTE w:n – stores the 8-bit value w into n consecutive bytes

String directives

• .ASCII – allocates a sequence of bytes for an ASCII string
• .ASCIIZ – same as .ASCII, but adds a NULL character at the end of the string; strings are null-terminated, as in the C programming language
• .SPACE n – allocates space of n uninitialized bytes in the data segment
• Special characters in strings follow the C convention – newline: \n, tab: \t, quote: \"

Memory alignment

• The .align n directive makes the next data definition begin on a 2^n-byte boundary.
• .align 2 – the least significant 2 bits of the address must be 00.
• Memory is addressed as an array of bytes; words occupy 4 consecutive bytes (MIPS is a 32-bit processor), so a word is aligned only when its address is a multiple of 4. (The rounding an assembler performs is sketched in C at the end of this section.)

[Figure: memory as a byte array with word addresses 0, 4, 8, 12 – words starting at multiples of 4 are aligned, the others are not.]

Align example – how the assembler aligns data in the data segment

    .DATA
    .ALIGN 2
    var1: .BYTE 3, 5, 'A', 'P', 'O'
    var2: .WORD 0x12345678
    .ALIGN 3
    var3: .HALF 1000

Resulting layout (big endian), with the data segment starting at 0x2000:

Address   0  1  2  3  4  5  6  7  8  9  A  B  C  D  E  F
0x2000   03 05 41 50 4F .. .. .. 12 34 56 78
0x2010   03 E8

var1 occupies 0x2000–0x2004, var2 is padded up to the word boundary 0x2008, and .ALIGN 3 places var3 on the 8-byte boundary 0x2010 (1000 = 0x03E8).

Lecture motivation

Quick Quiz 1: Is the result of both code fragments the same?
Quick Quiz 2: Which of the code fragments is processed faster, and why?

Fragment A:

    int matrix[M][N];
    int i, j, sum = 0;
    ...
    for (i = 0; i < M; i++)
        for (j = 0; j < N; j++)
            sum += matrix[i][j];

Fragment B:

    int matrix[M][N];
    int i, j, sum = 0;
    ...
    for (j = 0; j < N; j++)
        for (i = 0; i < M; i++)
            sum += matrix[i][j];

Is there a rule for how to iterate over matrix elements efficiently? (The address patterns of the two loop orders are sketched at the end of this section.)

PC computer motherboard

[Photo of a PC motherboard.]
http://www.pcplanetsystems.com/abc/product_details.php?item_id=3263&category_id=208

Computer architecture (desktop x86 PC) – generic example

[Block diagram of a generic desktop x86 PC.]

From UMA to NUMA development (even in the PC segment)

NUMA – Non-Uniform Memory Architecture: instead of both CPUs sharing one memory controller in the Northbridge (UMA), each CPU integrates its own memory controller with local RAM, and I/O is handled by Southbridges (USB, SATA, PCI-E).

[Block diagrams: UMA with RAM on the Northbridge versus NUMA with per-CPU memory controllers.]

MC – memory controller: contains the circuitry responsible for SDRAM reads and writes.
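The lb/lbu and lh/lhu extensions promised in the data-types section: C's signed and unsigned narrow types mimic sign and zero extension exactly. A minimal sketch; the test values are illustrative.

```c
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    uint8_t  byte_in_mem = 0xF0;    /* a byte with its top bit set     */
    uint16_t half_in_mem = 0x8000;  /* a halfword with its top bit set */

    int32_t  lb_result  = (int8_t)byte_in_mem;   /* lb : sign extension */
    uint32_t lbu_result = byte_in_mem;           /* lbu: zero extension */
    int32_t  lh_result  = (int16_t)half_in_mem;  /* lh : sign extension */
    uint32_t lhu_result = half_in_mem;           /* lhu: zero extension */

    printf("lb  -> %d (0x%08X)\n", lb_result, (uint32_t)lb_result);
    printf("lbu -> %u (0x%08X)\n", lbu_result, lbu_result);
    printf("lh  -> %d (0x%08X)\n", lh_result, (uint32_t)lh_result);
    printf("lhu -> %u (0x%08X)\n", lhu_result, lhu_result);
    return 0;
}
```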
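The .align rounding mentioned in the memory-alignment section: an assembler advances its location counter to the next 2^n-byte boundary with mask arithmetic. A sketch under that assumption; align_up is a hypothetical helper name, and the addresses replay the var1/var2/var3 example above.

```c
#include <stdint.h>
#include <stdio.h>

/* Round addr up to the next 2^n-byte boundary (what .align n does). */
static uint32_t align_up(uint32_t addr, unsigned n)
{
    uint32_t mask = (1u << n) - 1;   /* e.g. n = 2 -> mask = 0b11      */
    return (addr + mask) & ~mask;    /* round up, clear the low n bits */
}

int main(void)
{
    /* var1 = 5 bytes starting at 0x2000; .WORD var2 needs .align 2: */
    printf("var2 at 0x%04X\n", align_up(0x2005, 2));  /* -> 0x2008 */
    /* var2 ends at 0x200C; .ALIGN 3 -> next 8-byte boundary:       */
    printf("var3 at 0x%04X\n", align_up(0x200C, 3));  /* -> 0x2010 */
    return 0;
}
```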
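Finally, the address patterns behind the quick quiz: C stores matrix[M][N] in row-major order, so &matrix[i][j] = base + (i·N + j)·sizeof(int). The sketch below only prints the byte offsets each loop order visits, leaving the quiz's "why" for the lecture; M and N are small illustrative values.

```c
#include <stdio.h>

#define M 3
#define N 4

int matrix[M][N];

int main(void)
{
    /* Row-major rule: &matrix[i][j] == base + (i*N + j)*sizeof(int). */
    printf("loop order A (i outer): ");
    for (int i = 0; i < M; i++)
        for (int j = 0; j < N; j++)
            printf("%ld ", (long)((char *)&matrix[i][j] - (char *)matrix));

    printf("\nloop order B (j outer): ");
    for (int j = 0; j < N; j++)
        for (int i = 0; i < M; i++)
            printf("%ld ", (long)((char *)&matrix[i][j] - (char *)matrix));

    /* A walks consecutive bytes; B jumps N*sizeof(int) bytes per step. */
    printf("\n");
    return 0;
}
```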