Bit-Exact ECC Recovery (BEER): Determining DRAM On-Die ECC Functions by Exploiting DRAM Data Retention Characteristics

Total Pages: 16

File Type: PDF, Size: 1020 KB

Bit-Exact ECC Recovery (BEER): Determining DRAM On-Die ECC Functions by Exploiting DRAM Data Retention Characteristics

Minesh Patel†, Jeremie S. Kim‡†, Taha Shahroodi†, Hasan Hassan†, Onur Mutlu†‡ (†ETH Zürich, ‡Carnegie Mellon University)

arXiv:2009.07985v1 [cs.AR] 17 Sep 2020

Increasing single-cell DRAM error rates have pushed DRAM manufacturers to adopt on-die error-correction coding (ECC), which operates entirely within a DRAM chip to improve factory yield. The on-die ECC function and its effects on DRAM reliability are considered trade secrets, so only the manufacturer knows precisely how on-die ECC alters the externally-visible reliability characteristics. Consequently, on-die ECC obstructs third-party DRAM customers (e.g., test engineers, experimental researchers), who typically design, test, and validate systems based on these characteristics.

To give third parties insight into precisely how on-die ECC transforms DRAM error patterns during error correction, we introduce Bit-Exact ECC Recovery (BEER), a new methodology for determining the full DRAM on-die ECC function (i.e., its parity-check matrix) without hardware tools, prerequisite knowledge about the DRAM chip or on-die ECC mechanism, or access to ECC metadata (e.g., error syndromes, parity information). BEER exploits the key insight that non-intrusively inducing data-retention errors with carefully-crafted test patterns reveals behavior that is unique to a specific ECC function. We use BEER to identify the ECC functions of 80 real LPDDR4 DRAM chips with on-die ECC from three major DRAM manufacturers. We evaluate BEER's correctness in simulation and performance on a real system to show that BEER is effective and practical across a wide range of on-die ECC functions.

To demonstrate BEER's value, we propose and discuss several ways that third parties can use BEER to improve their design and testing practices. As a concrete example, we introduce and evaluate BEEP, the first error profiling methodology that uses the known on-die ECC function to recover the number and bit-exact locations of unobservable raw bit errors responsible for observable post-correction errors.

1. Introduction

Dynamic random access memory (DRAM) is the predominant choice for system main memory across a wide variety of computing platforms due to its favorable cost-per-bit relative to other memory technologies. DRAM manufacturers maintain a competitive advantage by improving raw storage densities across device generations. Unfortunately, these improvements largely rely on process technology scaling, which causes serious reliability issues that reduce factory yield. DRAM manufacturers traditionally mitigate yield loss using post-manufacturing repair techniques such as row/column sparing [51]. However, continued technology scaling in modern DRAM chips requires stronger error-mitigation mechanisms to remain viable because of random single-bit errors that are increasingly frequent at smaller process technology nodes [39,76,89,99,109,119,120,124,127,129,133,160]. Therefore, DRAM manufacturers have begun to use on-die error correction coding (on-die ECC), which silently corrects single-bit errors entirely within the DRAM chip [39, 76, 120, 129, 138]. On-die ECC is completely invisible outside of the DRAM chip, so ECC metadata (i.e., parity-check bits, error syndromes) that is used to correct errors is hidden from the rest of the system.

Prior works [60, 97, 98, 120, 129, 133, 138, 147] indicate that existing on-die ECC codes are 64- or 128-bit single-error correction (SEC) Hamming codes [44]. However, each DRAM manufacturer considers their on-die ECC mechanism's design and implementation to be highly proprietary and ensures not to reveal its details in any public documentation, including DRAM standards [68,69], DRAM datasheets [63,121,149,158], publications [76, 97, 98, 133], and industry whitepapers [120, 147].

Because the unknown on-die ECC function is encapsulated within the DRAM chip, it obfuscates raw bit errors (i.e., pre-correction errors)¹ in an ECC-function-specific manner. Therefore, the locations of software-visible uncorrectable errors (i.e., post-correction errors) often no longer match those of the pre-correction errors that were caused by physical DRAM error mechanisms. While this behavior appears desirable from a black-box perspective, it poses serious problems for third-party DRAM customers who study, test and validate, and/or design systems based on the reliability characteristics of the DRAM chips that they buy and use. Section 2.2 describes these customers and the problems they face in detail, including, but not limited to, three important groups: (1) system designers who need to ensure that supplementary error-mitigation mechanisms (e.g., rank-level ECC within the DRAM controller) are carefully designed to cooperate with the on-die ECC function [40, 129, 160], (2) large-scale industries (e.g., computing system providers such as Microsoft [33], HP [47], and Intel [59], DRAM module manufacturers [4, 92, 159]) or government entities (e.g., national labs [131,150]) who must understand DRAM reliability characteristics when validating DRAM chips they buy and use, and (3) researchers who need full visibility into physical device characteristics to study and model DRAM reliability [17, 20, 31, 42, 43, 46, 72, 78–86, 109, 138, 139, 172, 178].

¹We use the term "error" to refer to any bit-flip event, whether observed (e.g., uncorrectable bit-flips) or unobserved (e.g., corrected by ECC).

For each of these third parties, merely knowing or reverse-engineering the type of ECC code (e.g., n-bit Hamming code) based on existing industry [60, 97, 98, 120, 133, 147] and academic [129,138] publications is not enough to determine exactly how the ECC mechanism obfuscates specific error patterns. This is because an ECC code of a given type can have many different implementations based on how its ECC function (i.e., its parity-check matrix) is designed, and different designs lead to different reliability characteristics. For example, Figure 1 shows the relative probability of observing errors in different bit positions for three different ECC codes of the same type (i.e., single-error correction Hamming code with 32 data bits and 6 parity-check bits) but that use different ECC functions. We obtain this data by simulating 10⁹ ECC words using the EINSim simulator [2, 138] and show medians and 95% confidence intervals calculated via statistical bootstrapping [32] over 1000 samples. We simulate a 0xFF test pattern² with uniform-random pre-correction errors at a raw bit error rate of 10⁻⁴ (e.g., as often seen in experimental studies [17,20,43,46,76,102,109,139,157]).

Figure 1: Relative error probabilities in different bit positions for different ECC functions with uniform-randomly distributed pre-correction (i.e., raw) bit errors. (The plot shows the pre-correction error distribution alongside post-correction distributions for ECC Functions 0, 1, and 2 across bit indices 0–31 of the dataword.)

The data demonstrates that ECC codes of the same type can have vastly different post-correction error characteristics. This is because each ECC mechanism acts differently when faced with more errors than it can correct (i.e., uncorrectable errors), causing it to mistakenly perform ECC-function-specific "corrections" to bits that did not experience errors (i.e., miscorrections, which Section 3.3 expands upon). Therefore, a researcher or engineer who studies two DRAM chips that use the same type of ECC code but different ECC functions may find that the chips' software-visible reliability characteristics are quite different even if the physical DRAM cells' reliability characteristics are identical. On the other hand, if we know the full ECC function (i.e., its parity-check matrix), we can calculate exactly which pre-correction error pattern(s) result in a set of observed errors.

BEER exploits the key insight that forcing the ECC function to act upon carefully-crafted uncorrectable error patterns reveals ECC-function-specific behavior that disambiguates different ECC functions. BEER comprises three key steps: (1) deliberately inducing uncorrectable data-retention errors by pausing DRAM refresh while using carefully-crafted test patterns to control the errors' bit-locations, which is done by leveraging data-retention errors' intrinsic data-pattern asymmetry (discussed in Section 3.2), (2) enumerating the bit positions where the ECC mechanism causes miscorrections, and (3) using a SAT solver [28] to solve for the unique parity-check matrix that causes the observed set of miscorrections.

We experimentally apply BEER to 80 real LPDDR4 DRAM chips with on-die ECC from three major DRAM manufacturers to determine the chips' on-die ECC functions. We describe the experimental steps required to apply BEER to any DRAM chip with on-die ECC and show that BEER tolerates observed experimental noise. We show that different manufacturers appear to use different on-die ECC functions while chips from the same manufacturer and model number appear to use the same on-die ECC function (Section 5.1.3). Unfortunately, our experimental studies with real DRAM chips have two limitations against further validation: (1) because the on-die ECC function is considered trade secret for each manufacturer, we are unable to obtain a groundtruth to compare BEER's results against, even when considering non-disclosure agreements with DRAM manufacturers and (2) we are unable to publish the final ECC functions that we uncover using BEER for confidentiality reasons (discussed in Section 2.1).

To overcome the limitations of experimental studies with real DRAM chips, we rigorously evaluate BEER's correctness in simulation (Section 6). We show that BEER correctly recovers the on-die ECC function for 115,300 single-error correction
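To make the role of miscorrections concrete, the following Python sketch (an editorial illustration, not code from the paper or from EINSim) uses a toy (7,4) single-error-correcting Hamming code to show that the same uncorrectable two-bit raw error pattern is "corrected" to different bit positions under two different parity-check matrices. Observing which positions can be miscorrected under carefully chosen uncorrectable patterns is exactly the signal that BEER's SAT-solver step uses to pin down the parity-check matrix.

```python
# Minimal sketch (not the paper's code): how miscorrections reveal the ECC function.
# A toy (7,4) single-error-correcting Hamming code is defined only by its
# parity-check matrix H (3x7 over GF(2)). Real on-die ECC uses larger codes
# (e.g., 64 or 128 data bits), but the principle is identical.

import numpy as np

def syndrome_decode(codeword, H):
    """Return the post-correction word: flip the bit whose H-column matches
    the syndrome (standard SEC syndrome decoding)."""
    syndrome = H @ codeword % 2                      # length-3 syndrome over GF(2)
    corrected = codeword.copy()
    if syndrome.any():                               # non-zero syndrome -> "correct" one bit
        for col in range(H.shape[1]):
            if np.array_equal(H[:, col], syndrome):
                corrected[col] ^= 1                  # a miscorrection if >1 raw error occurred
                break
    return corrected

# Two valid SEC parity-check matrices of the same code *type* but different *functions*
# (columns are distinct and non-zero in both; only the bit-to-column mapping differs).
H_a = np.array([[1, 0, 1, 0, 1, 0, 1],
                [0, 1, 1, 0, 0, 1, 1],
                [0, 0, 0, 1, 1, 1, 1]])
H_b = H_a[:, ::-1]                                   # same columns, reversed bit mapping

# The same uncorrectable raw error pattern: two bit flips, in positions 0 and 1.
raw_error = np.array([1, 1, 0, 0, 0, 0, 0])
stored    = np.zeros(7, dtype=int)                   # an all-zeros codeword for simplicity
read      = stored ^ raw_error

for name, H in (("ECC function A", H_a), ("ECC function B", H_b)):
    post = syndrome_decode(read, H)
    flipped = np.flatnonzero(post ^ stored)
    print(f"{name}: post-correction error positions = {flipped.tolist()}")
```

Running the sketch prints post-correction error positions [0, 1, 2] for one function and [0, 1, 6] for the other: the two-bit raw error lands the miscorrection in a bit position that depends entirely on the parity-check matrix, mirroring the per-function differences visible in Figure 1.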
Recommended publications
  • HP Z620 Memory Configurations and Optimization
    HP recommends Windows® 7. HP Z620 Memory Configurations and Optimization. The purpose of this document is to provide an overview of the memory configuration for the HP Z620 Workstation and to provide recommendations to optimize performance.

    Supported Memory Modules: The types of memory supported on a HP Z620 are:
    • 2 GB and 4 GB PC3-12800E 1600MHz DDR3 Unbuffered ECC DIMMs
    • 4 GB and 8 GB PC3-12800R 1600MHz DDR3 Registered DIMMs
    • 1.35V and 1.5V DIMMs are supported, but the system will operate the DIMMs at 1.5V only.
    • 2 Gb and 4 Gb based DIMMs are supported.
    See the Memory Technology White Paper for additional technical information.

    Memory Features: ECC is supported on all of our supported DIMMs.
    • Single-bit errors are automatically corrected.
    • Multi-bit errors are detected and will cause the system to immediately reboot and halt with an F1 prompt error message.
    Non-ECC memory does not detect or correct single-bit or multi-bit errors, which can cause instability or corruption of data in the platform. Command and Address parity is supported with Registered DIMMs. See the Memory Technology White Paper for additional technical information.

    Platform Capabilities: Maximum capacity:
    • Single processor: 64 GB
    • Dual processors: 96 GB
    Total of 12 memory sockets:
    • 8 sockets on the motherboard: 4 channels per processor and 2 sockets per channel
    • 4 sockets on the 2nd CPU and memory module: 4 channels per processor and 1 socket per channel

    Optimize Performance: Generally, maximum memory performance is achieved by evenly distributing total desired memory capacity across all operational channels. Proper individual DIMM capacity selection is essential to maximizing performance. On the second CPU, installing the same amount of memory as the first CPU will optimize performance.
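As a small aside on the "evenly distribute capacity across all operational channels" guidance above, the following Python sketch (illustrative only; the channel count, socket count, and DIMM sizes are assumptions for the example, not HP specifications) searches for a per-channel DIMM population that splits a target capacity evenly.

```python
# Illustrative sketch: balance a target memory capacity across channels.
# Channel count, socket count, and allowed DIMM sizes are example assumptions.

from itertools import combinations_with_replacement

def balanced_population(target_gb, channels=4, sockets_per_channel=2,
                        dimm_sizes_gb=(2, 4, 8)):
    """Return a per-channel DIMM list that spreads target_gb evenly, or None."""
    if target_gb % channels:
        return None                       # cannot split evenly across channels
    per_channel = target_gb // channels
    # Try every DIMM combination that fits the sockets of one channel.
    for count in range(1, sockets_per_channel + 1):
        for combo in combinations_with_replacement(dimm_sizes_gb, count):
            if sum(combo) == per_channel:
                return {f"channel {c}": list(combo) for c in range(channels)}
    return None

print(balanced_population(32))   # one 8 GB DIMM per channel
print(balanced_population(24))   # a 2 GB + 4 GB pair per channel
print(balanced_population(10))   # None: 10 GB cannot be split evenly across 4 channels
```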
  • Interfacing EPROM to the 8088
    Chapter 10: Memory Interface. Introduction: Simple or complex, every microprocessor-based system has a memory system. Almost all systems contain two main types of memory: read-only memory (ROM) and random access memory (RAM) or read/write memory. This chapter explains how to interface both memory types to the Intel family of microprocessors.

    MEMORY DEVICES. Before attempting to interface memory to the microprocessor, it is essential to understand the operation of memory components. In this section, we explain functions of the four common types of memory:
    • read-only memory (ROM)
    • Flash memory (EEPROM)
    • static random access memory (SRAM)
    • dynamic random access memory (DRAM)

    Memory Pin Connections:
    • address inputs
    • data outputs or input/outputs
    • some type of selection input
    • at least one control input to select a read or write operation
    Figure 10–1: A pseudomemory component illustrating the address, data, and control connections.

    Address Connections. Memory devices have address inputs to select a memory location within the device. They are almost always labeled from A0, the least significant address input, to An, where the subscript n can be any value and is always one less than the total number of address pins. A memory device with 10 address pins has its address pins labeled from A0 to A9. The number of address pins on a memory device is determined by the number of memory locations found within it. Today, common memory devices have between 1K (1024) and 1G (1,073,741,824) memory locations, with 4G and larger devices on the horizon. A 1K memory device has 10 address pins; therefore, 10 address inputs are required to select any of its 1024 memory locations. It takes a 10-bit binary number to select any single location on a 1024-location device.
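The address-pin rule quoted above (a 1K-location device needs 10 address inputs) generalizes to the ceiling of log2 of the location count; the short Python sketch below is a generic worked example, not code from the chapter.

```python
# Sketch: address inputs required for a memory device with a given number of locations.
from math import ceil, log2

def address_pins(locations: int) -> int:
    """A device with `locations` addressable locations needs ceil(log2(locations)) address inputs."""
    return ceil(log2(locations))

for locations in (1024, 64 * 1024, 1 << 30):
    pins = address_pins(locations)
    print(f"{locations:>12} locations -> A0..A{pins - 1} ({pins} address pins)")
# 1024 locations need A0..A9 (10 pins); 2**30 locations need A0..A29 (30 pins).
```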
  • Load Management and Demand Response in Small and Medium Data Centers
    Thiago Lara Vasques Load Management and Demand Response in Small and Medium Data Centers PhD Thesis in Sustainable Energy Systems, supervised by Professor Pedro Manuel Soares Moura, submitted to the Department of Mechanical Engineering, Faculty of Sciences and Technology of the University of Coimbra May 2018 Load Management and Demand Response in Small and Medium Data Centers by Thiago Lara Vasques PhD Thesis in Sustainable Energy Systems in the framework of the Energy for Sustainability Initiative of the University of Coimbra and MIT Portugal Program, submitted to the Department of Mechanical Engineering, Faculty of Sciences and Technology of the University of Coimbra Thesis Supervisor Professor Pedro Manuel Soares Moura Department of Electrical and Computers Engineering, University of Coimbra May 2018 This thesis has been developed under the Energy for Sustainability Initiative of the University of Coimbra and been supported by CAPES (Coordenação de Aperfeiçoamento de Pessoal de Nível Superior – Brazil). ACKNOWLEDGEMENTS First and foremost, I would like to thank God for coming to the conclusion of this work with health, courage, perseverance and above all, with a miraculous amount of love that surrounds and graces me. The work of this thesis has also the direct and indirect contribution of many people, who I feel honored to thank. I would like to express my gratitude to my supervisor, Professor Pedro Manuel Soares Moura, for his generosity in opening the doors of the University of Coimbra by giving me the possibility of his orientation when I was a stranger. Subsequently, by his teachings, guidance and support in difficult times. You, Professor, inspire me with your humbleness given the knowledge you possess.
  • A Modern Primer on Processing in Memory
    A Modern Primer on Processing in Memory Onur Mutlua,b, Saugata Ghoseb,c, Juan Gómez-Lunaa, Rachata Ausavarungnirund SAFARI Research Group aETH Zürich bCarnegie Mellon University cUniversity of Illinois at Urbana-Champaign dKing Mongkut’s University of Technology North Bangkok Abstract Modern computing systems are overwhelmingly designed to move data to computation. This design choice goes directly against at least three key trends in computing that cause performance, scalability and energy bottlenecks: (1) data access is a key bottleneck as many important applications are increasingly data-intensive, and memory bandwidth and energy do not scale well, (2) energy consumption is a key limiter in almost all computing platforms, especially server and mobile systems, (3) data movement, especially off-chip to on-chip, is very expensive in terms of bandwidth, energy and latency, much more so than computation. These trends are especially severely-felt in the data-intensive server and energy-constrained mobile systems of today. At the same time, conventional memory technology is facing many technology scaling challenges in terms of reliability, energy, and performance. As a result, memory system architects are open to organizing memory in different ways and making it more intelligent, at the expense of higher cost. The emergence of 3D-stacked memory plus logic, the adoption of error correcting codes inside the latest DRAM chips, proliferation of different main memory standards and chips, specialized for different purposes (e.g., graphics, low-power, high bandwidth, low latency), and the necessity of designing new solutions to serious reliability and security issues, such as the RowHammer phenomenon, are an evidence of this trend.
  • Itanium-Based Solutions by HP
    Itanium-based solutions by hp: an overview of the Itanium™-based hp rx4610 server. A white paper from hewlett-packard, june 2001.

    table of contents: table of contents 2; executive summary 3; why Itanium is the future of computing 3; rx4610 at a glance 3; rx4610 product specifications 4; rx4610 physical and environmental specifications 4; the rx4610 and the hp server lineup 5; rx4610 architecture 6; 64-bit address space and memory capacity 6; I/O subsystem design 7; special features of the rx4610 server 8; multiple upgrade and migration paths for investment protection 8; high availability and manageability 8; advanced error detection, correction, and containment 8; baseboard management controller (BMC) 8; redundant, hot-swap power supplies 9; redundant, hot-swap cooling 9; hot-plug disk drives 9; hot-plug PCI I/O slots 9; internal removable media 10; system control panel 10; ASCII console for hp-ux 10; space-saving rack density 10; complementary design and packaging 10; how hp makes the Itanium transition easy 11; binary compatibility 11; hp-ux operating system 11; seamless transition—even for home-grown applications 12; transition help from hp 12; Itanium quick start service 12; partner technology access centers 12; upgrades and financial incentives 12; conclusion 13; for more information 13; appendix: Itanium advantages in your computing future 14; hp’s CPU roadmap 14; Itanium processor architecture 15; predication enhances parallelism 15; speculation minimizes the effect of memory latency 15; inherent scalability delivers easy expansion 16; what this means in a server 16.

    executive summary: The Itanium™ Processor Family is the next great stride in computing, and it’s here today.
  • IMME128M64D2SOS8AG (Die Revision E) 1GByte (128M x 64 Bit)
    Product Specification | Rev. 1.0 | 2015. IMME128M64D2SOS8AG (Die Revision E), 1GByte (128M x 64 Bit), 1GB DDR2 Unbuffered SO-DIMM built by ECC DRAM. RoHS Compliant Product. Version: Rev. 1.0, FEB 2015 (1.0 - Initial release). Remark: Please refer to the last page for: i) Contents, ii) List of Tables, iii) List of Figures.

    We Listen to Your Comments: Any information within this document that you feel is wrong, unclear or missing at all? Your feedback will help us to continuously improve the quality of this document. Please send your proposal (including a reference to this document) to: [email protected]

    Features:
    • 200-Pin Unbuffered Small Outline Dual-In-Line Memory Module
    • Capacity: 1GB
    • Maximum Data Transfer Rate: 6.40 GB/Sec
    • JEDEC-Standard
    • Built by ECC DRAM Chips
    • Power Supply: VDD, VDDQ = 1.8 ± 0.1 V
    • Bi-directional Differential Data-Strobe (single-ended data-strobe is an optional feature)
    • 64 Bit Data Bus Width without ECC
    • Programmable CAS Latency (CL): PC2-6400: 4, 5, 6; PC2-5300: 4, 5
    • Programmable Additive Latency (Posted /CAS): 0, CL-2 or CL-1 (Clock)
    • Write Latency (WL) = Read Latency (RL) - 1
    • Posted /CAS
    • On-Die Termination (ODT)
    • Off-Chip Driver (OCD) Impedance Adjustment
    • Burst Type (Sequential & Interleave)
    • Burst Length: 4, 8
    • Refresh Mode: Auto and Self
    • 8192 Refresh Cycles / 64ms
    • Serial Presence Detect (SPD) with EEPROM
    • SSTL_18 Interface
    • Gold Edge Contacts
    • 100% RoHS-Compliant
    • Standard Module Height: 30.00mm (1.18 inch)

    ECC DRAM Introduction. Special Features (ECC Functionality): Embedded error correction code (ECC) functionality corrects single bit errors within each 64 bit memory-word.
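The module's stated 6.40 GB/s maximum data transfer rate follows from its 64-bit bus and PC2-6400 (800 MT/s) transfer rate; the Python sketch below is plain arithmetic written for this page, not vendor code.

```python
# Sketch: peak module bandwidth = transfers per second x bus width in bytes.
def peak_bandwidth_gb_s(transfer_rate_mt_s: float, bus_width_bits: int = 64) -> float:
    """Peak bandwidth in GB/s for a memory module (1 GB/s taken as 1000 MB/s here)."""
    return transfer_rate_mt_s * (bus_width_bits // 8) / 1000

print(peak_bandwidth_gb_s(800))              # PC2-6400: 800 MT/s x 8 B = 6.4 GB/s (datasheet figure)
print(round(peak_bandwidth_gb_s(667), 1))    # PC2-5300: 667 MT/s -> ~5.3 GB/s
```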
  • Non-ECC Unbuffered DIMM Non-ECC Vs
    Quick Guide to DRAM for Industrial Applications, 2019, by SQRAM.

    What's the difference between DRAM & Flash? DRAM (Dynamic Random Access Memory) and Flash are key components in PC systems, but they are different types of semiconductor products with different speeds/capacity/power-off data storage.
    • Type: DRAM vs. Flash
    • Location: DRAM sits close to the CPU; Flash transfers data through the PCIe bus
    • IC density: low (DRAM) vs. high (Flash)
    • Module capacity: 32~64GB (DRAM) vs. 2~8TB (Flash)
    • Speed: DRAM by ns (faster than flash); Flash by ms
    • Power-off status: DRAM is volatile memory (data will be lost if powered off); NAND Flash is non-volatile memory (data can be stored if powered off)
    (An accompanying pyramid figure ranks SRAM/cache, DRAM, and NAND Flash from high speed/cost and small capacity to low speed/cost and large capacity.)

    What are the features of DRAM?
    • High Data Processing Speed: Extremely fast with low latency, by nanoseconds (10⁻⁹) access time. Much faster than HDD or SSD data speeds.
    • Volatile Memory: RAM is a type of volatile memory. It retains its data while powered on, but the data will vanish once the power is off.
    • Extremely Fast Transfer (CPU, DRAM, HDD): DRAM is closely connected to the CPU with short access time. The system performance will drop if data is processed directly by storage without DRAM.
    • Higher Capacity, Better Performance: DRAM of higher capacity can process more data to increase system performance. The more data processed by DRAM, the less HDD processing time.

    What are the DDR specifications (DDR, DDR2, DDR3, DDR4)? The prefetch length of DDR SDRAM is 2 bits. On DDR2 the prefetch length is increased to 4 bits, and on DDR3 and DDR4 it was raised to 8 bits and 16 bits respectively.
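To relate the prefetch lengths above to the familiar DDR speed grades, the Python sketch below uses the simplified textbook relationship that the external transfer rate equals the internal array clock multiplied by the prefetch length; the 200 MHz array clock and the per-generation prefetch values for DDR through DDR3 are common example figures, not taken from the guide.

```python
# Sketch: external transfer rate (MT/s) = internal array clock (MHz) x prefetch length.
PREFETCH = {"DDR": 2, "DDR2": 4, "DDR3": 8}   # prefetch bits per data pin, as in the guide

def transfer_rate_mt_s(array_clock_mhz: float, generation: str) -> float:
    return array_clock_mhz * PREFETCH[generation]

for gen in PREFETCH:
    rate = transfer_rate_mt_s(200, gen)
    print(f"{gen}: 200 MHz array clock x {PREFETCH[gen]}n prefetch = {rate:.0f} MT/s")
# DDR: 400 MT/s, DDR2: 800 MT/s, DDR3: 1600 MT/s, i.e. the DDR-400 / DDR2-800 / DDR3-1600 grades.
```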
  • Computer Architectures
    Computer Architectures: Processor + Memory. Richard Šusta, Pavel Píša, Czech Technical University in Prague, Faculty of Electrical Engineering. AE0B36APO Computer Architectures, Ver. 1.00, 2019.

    The main instruction cycle of the CPU:
    1. Initial setup/reset: set initial PC value, PSW, etc.
    2. Read the instruction from the memory: PC → to the address bus; read the memory contents (machine instruction) and transfer it to the IR; PC + l → PC, where l is length of the instruction.
    3. Decode operation code (opcode).
    4. Execute the operation: compute effective address, select registers, read operands, pass them through ALU and store result.
    5. Check for exceptions/interrupts (and service them).
    6. Repeat from the step 2.

    Single cycle CPU – implementation of the load instruction. lw: type I, rs – base address, imm – offset, rt – register where to store fetched data. Instruction format: opcode (6 bits, 31:26), rs (5 bits, 25:21), rt (5 bits, 20:16), immediate (16 bits, 15:0). (The slide's datapath figure shows the PC, instruction memory, register file, sign extension of the immediate, ALU, and data memory; the register file is written on the rising edge of CLK with RegWrite = 1.)
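The lw walk-through above maps directly onto a few lines of code. The following Python sketch (an illustrative simulation written for this page, not the course's materials) mimics what the single-cycle datapath does for lw: decode the I-type fields, sign-extend the immediate, add it to the base register, and write the loaded word into rt.

```python
# Sketch: one single-cycle execution of a MIPS-style lw (load word) instruction.

def sign_extend_16(value: int) -> int:
    """Sign-extend a 16-bit immediate to a Python int."""
    return value - 0x10000 if value & 0x8000 else value

def execute_lw(instr: int, regs: list, memory: dict) -> None:
    """Decode the I-type fields and perform rt <- MEM[rs + sign_ext(imm)]."""
    rs  = (instr >> 21) & 0x1F            # bits 25:21 - base address register
    rt  = (instr >> 16) & 0x1F            # bits 20:16 - destination register
    imm = sign_extend_16(instr & 0xFFFF)  # bits 15:0  - offset
    address = regs[rs] + imm              # ALU computes the effective address
    regs[rt] = memory.get(address, 0)     # data memory read, register file write

# Example: lw $8, 4($9)  ->  opcode = 0x23, rs = 9, rt = 8, imm = 4
instr = (0x23 << 26) | (9 << 21) | (8 << 16) | 4
regs = [0] * 32
regs[9] = 0x1000                          # base address held in $9
memory = {0x1004: 0xDEADBEEF}             # word stored at 0x1000 + 4
execute_lw(instr, regs, memory)
print(hex(regs[8]))                       # 0xdeadbeef
```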
  • ECE 571 – Advanced Microprocessor-Based Design Lecture 17
    ECE 571 – Advanced Microprocessor-Based Design, Lecture 17. Vince Weaver, http://web.eece.maine.edu/~vweaver, [email protected], 3 April 2018.

    Announcements: • HW8 is readings

    More DRAM

    ECC Memory
    • There's debate about how many errors can happen, anywhere from 10⁻¹⁰ error/bit*h (roughly one bit error per hour per gigabyte of memory) to 10⁻¹⁷ error/bit*h (roughly one bit error per millennium per gigabyte of memory)
    • Google did a study and they found more toward the high end
    • Would you notice if you had a bit flipped?
    • Scrubbing – only notice a flip once you read out a value

    Registered Memory
    • Registered vs Unregistered
    • Registered has a buffer on board. More expensive but can have more DIMMs on a channel
    • Registered may be slower (if it buffers for a cycle)
    • RDIMM/UDIMM

    Bandwidth/Latency Issues
    • Truly random access? No, burst speed fast, random speed not.
    • Is that a problem? Mostly filling cache lines?

    Memory Controller
    • Can we have full random access to memory? Why not just pass on CPU mem requests unchanged?
    • What might have higher priority?
    • Why might re-ordering the accesses help performance (back and forth between two pages)

    Reducing Refresh
    • DRAM Refresh Mechanisms, Penalties, and Trade-Offs by Bhati et al.
    • Refresh hurts performance:
      ◦ Memory controller stalls access to memory being refreshed
      ◦ Refresh takes energy (read/write). On a 32Gb device, up to 20% of energy consumption and 30% of performance

    Async vs Sync Refresh
    • Traditional refresh rates
      ◦ Async Standard (15.6us)
      ◦ Async Extended (125us)
      ◦ SDRAM -
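The parenthetical conversions in the ECC bullet above ("one bit error per hour per gigabyte") are easy to check; the Python sketch below is my own arithmetic verification, not course code.

```python
# Sketch: convert an error rate in errors/(bit*hour) to errors per gigabyte-hour.
BITS_PER_GB = 8 * 2**30                  # ~8.59e9 bits in one gigabyte (GiB)
HOURS_PER_MILLENNIUM = 24 * 365.25 * 1000

def errors_per_gb_hour(rate_per_bit_hour: float) -> float:
    return rate_per_bit_hour * BITS_PER_GB

print(errors_per_gb_hour(1e-10))                          # ~0.86 -> about one error per GB per hour
print(errors_per_gb_hour(1e-17) * HOURS_PER_MILLENNIUM)   # ~0.75 -> about one error per GB per millennium
```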
  • DDR ECC Reference Design to Improve Memory Reliability in 66AK2Gx-Based Systems
    TI Designs: TIDEP-0070. DDR ECC Reference Design to Improve Memory Reliability in 66AK2Gx-Based Systems.

    TI Designs: This reference design describes system considerations for the DDR-SDRAM memory interface with Error Correcting Code (ECC) support in high-reliability applications based on the 66AK2Gx Multicore DSP + ARM® System-on-Chip (SoC). System interfaces, board hardware, software, throughput performance, and diagnostic procedures are discussed. Detailed description about the DDR interface is available in the device Technical Reference Manual (TRM).

    Features:
    • 32-bit DDR3L Interface With Optional 4-bit ECC for High-Reliability System Designs
    • Flexible System Configurations With DDR ECC
    • Built-In Read-Modify-Write (RMW) Hardware Supporting ECC Operation With Non-Aligned Access
    • Minimum Performance Impact
    • Implemented and tested on EVMK2G Hardware and Supported in Processor SDK for K2G

    Design Resources: TIDEP0070 (Design Folder), 66AK2G02 (Product Folder), 66AK2G12 (Product Folder), K2G General Purpose EVM (Tool Folder), Processor SDK for K2G (Download Software), ASK Our E2E Experts.

    Applications:
    • Automotive Audio Amplifiers
    • Home Audio
    • Professional Audio
    • Power Protection
    • Industrial Communications and Controls

    (A block diagram shows the 66AK2G DDR EMIF driving control, address, and data to Device_0 … Device_n plus Device_ECC.)

    An IMPORTANT NOTICE at the end of this TI reference design addresses authorized use, intellectual property matters and other important disclaimers and information. TIDUBO4B, April 2016, Revised June 2018. DDR ECC Reference Design to Improve Memory Reliability in 66AK2Gx-
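The "read-modify-write (RMW) hardware supporting ECC operation with non-aligned access" feature reflects a general property of word-granularity ECC: a write narrower than the protected word must read the whole word, merge the new bytes, recompute the check bits, and write both back. The Python sketch below is a conceptual model written for this page, not TI's implementation; the 32-bit quantum and the XOR "ECC" placeholder are assumptions for illustration only.

```python
# Conceptual sketch of an ECC read-modify-write for a sub-word (non-aligned) write.
# The ECC quantum here is a 32-bit data word protected by a placeholder parity byte;
# real hardware uses a SEC/DED code instead of a plain XOR.

def ecc_of(word: int) -> int:
    """Placeholder 'ECC' for illustration: XOR of the four data bytes."""
    return (word ^ (word >> 8) ^ (word >> 16) ^ (word >> 24)) & 0xFF

def write_byte(mem: dict, ecc: dict, addr: int, value: int) -> None:
    """Write one byte under word-granularity ECC via read-modify-write."""
    word_addr = addr & ~0x3                  # align down to the 32-bit ECC quantum
    byte_lane = addr & 0x3
    word = mem.get(word_addr, 0)             # 1) read the whole protected word
    assert ecc.get(word_addr, ecc_of(0)) == ecc_of(word), "ECC check on read"
    mask = 0xFF << (8 * byte_lane)
    word = (word & ~mask) | ((value & 0xFF) << (8 * byte_lane))   # 2) merge the new byte
    mem[word_addr] = word                    # 3) write back the data...
    ecc[word_addr] = ecc_of(word)            #    ...and the freshly recomputed check bits

mem, ecc = {0x100: 0x11223344}, {0x100: ecc_of(0x11223344)}
write_byte(mem, ecc, 0x102, 0xAB)            # modify only byte lane 2
print(hex(mem[0x100]))                       # 0x11ab3344
```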
  • Managing Correctable Memory Errors on Cisco UCS Servers
    Managing Correctable Memory Errors on Cisco UCS Servers. This document provides empirical evidence that shows no correlation between correctable and uncorrectable errors on UCS M4 and earlier generation servers. Furthermore, using industry-standard benchmarks, this document demonstrates that systems with correctable errors do not exhibit system performance degradation. Given these findings, starting with UCS Manager (UCSM) 2.2(7), 3.1(1), and Cisco Integrated Management Controller (CIMC) 2.0(9) for rack standalone, Cisco UCS server memory-error threshold policies will not declare modules with correctable errors to be degraded. For customers who are not on UCSM 2.2(7), 3.1(1), or CIMC 2.0(9) or newer that experience a degraded memory alert for correctable errors, the Cisco UCS team recommends that memory modules with correctable errors not be replaced immediately upon alert. Instead, please reset the memory-error counters and resume operation. See the Additional Resources section for UCS M5 servers. © 2020 Cisco and/or its affiliates. All rights reserved. This document is Cisco Public.

    Contents: Field Recommendations: Correctable Errors and Threshold Policies 3; Overview of Memory Errors 3; Classification of Memory Errors
  • HP Z220 CMT and HP Z220 SFF Memory Configurations and Optimization
    HP recommends Windows® 7. HP Z220 CMT and HP Z220 SFF Memory Configurations and Optimization. The purpose of this document is to provide an overview of the memory configuration for the HP Z220 CMT and HP Z220 SFF Workstation and to provide recommendations to optimize performance.

    Supported memory modules. The types of memory supported on HP Z220 CMT and HP Z220 SFF Workstations are:
    • 2 GB, 4 GB and 8 GB PC3-12800U 1600MHz DDR3 Unbuffered non-ECC DIMMs
    • 2 GB, 4 GB and 8 GB PC3-12800E 1600MHz DDR3 Unbuffered ECC DIMMs
    • 1.35V and 1.5V DIMMs, but the system will operate the DIMMs, safely, at 1.5V only
    • Single and dual rank 2 Gb and 4 Gb based DIMMs
    See the Memory Technology White Paper for additional technical information.

    Figure 1: Processor and memory support (columns: processor, supports non-ECC memory, supports ECC memory, notes):
    • Intel® Core™ i3-32xx: supports non-ECC and ECC memory
    • Intel® Core™ i3-21xx: supports non-ECC and ECC memory
    • Intel® Core™ i5-34xx: supports non-ECC memory (see note *)
    • Intel® Core™ i5-35xx: supports non-ECC memory (see note *)
    • Intel® Core™ i7-37xx: supports non-ECC memory (see note *)
    • Xeon® E3-12xx v2: supports non-ECC and ECC memory
    • Pentium® G6xx: supports non-ECC and ECC memory
    * If ECC memory is added to the system, the ECC function is disabled and the DIMMs will appear to the system as Non-ECC memory.

    Platform capabilities. Maximum capacity: 32 GB.
    • HP Z220 has a total of 4 memory sockets
    • 2 channels with 2 sockets per channel
    Speed:
    • 1600 MHz, 1333 MHz and 1066 MHz DIMMs are supported
    • Memory will operate at the speed of the slowest rated installed processor or DIMM. The processor determines what type of memory is supported on the HP Z220.