Improving Energy-Efficiency in Dynamic Memories Through Retention Failure Detection
Total Page:16
File Type:pdf, Size:1020Kb
Received January 22, 2019, accepted February 22, 2019, date of publication February 27, 2019, date of current version March 13, 2019. Digital Object Identifier 10.1109/ACCESS.2019.2901738 Improving Energy-Efficiency in Dynamic Memories Through Retention Failure Detection ROBERT GITERMAN 1, ROMAN GOLMAN2, AND ADAM TEMAN 2 1Telecommunications Circuits Laboratory, Institute of Electrical Engineering, EPFL, 1015 Lausanne, Switzerland 2Emerging Nanoscale Integrated Circuits and Systems Laboratories, Faculty of Engineering, Bar-Ilan University, Ramat Gan 5290002, Israel Corresponding author: Robert Giterman (robert.giterman@epfl.ch) This work was supported by the Israel Science Foundation under Grant 996/18 and Grant 2181/18. ABSTRACT A gain-cell embedded DRAM (GC-eDRAM) is an attractive logic-compatible alternative to the conventional static random access memory (SRAM) for the implementation of embedded memories, as it offers higher density, lower leakage, and two-ported operation. However, it requires periodic refresh cycles to maintain its data which deteriorates due to leakage. The refresh-rate, which is traditionally set according to the worst cell in the array under extreme operating conditions, leads to a significant refresh power consumption and decreased memory availability. In this paper, we propose to reduce the cost of GC-eDRAM refresh by employing failure detection to lower the refresh-rate. A 4T dynamic complementary dual-modular redundancy bitcell is proposed to offer per-bit error detection, resulting in a substantial decrease in the refresh-rate and over 60% power reduction compared with the SRAM. The proposed approach is also compared with the conventional SRAM and GCeDRAM implementations with integrated error correction codes, demonstrating significant area and latency reductions. INDEX TERMS Error detection and correction, error correcting codes, SRAM, gain cells, logic-compatible eDRAM, GC-eDRAM, low power. I. INTRODUCTION periodic refresh operations to maintain the data, which is As the functionality and complexity of integrated circuit (IC) stored on a parasitic MOSFET gate capacitor. While gain-cell systems continue to grow, embedded memories are of great implementations in mature technology nodes were shown to interest in modern multiprocessors and other VLSI system- provide sufficiently high data retention times (DRTs) [6], [7], on-chips (SoCs) [1]. The mainstream solution to the imple- gain-cell embedded DRAM (GC-eDRAM) implementations mentation of embedded memories is based on six-transistor in 65 nm nodes and below demonstrate significantly lower (6T) static random access memory (SRAM) macrocells due DRTs due to increased leakage currents and decreased para- to their high speed and process scalability. However, 6T sitic storage capacitances [8], [9]. Furthermore, the refresh- SRAM incur a large area penalty and suffer from high static rate of GC-eDRAM is commonly set according to the power consumption, often dominating the power budget and worst DRT of all cells in a memory array, simulated under real-estate of these VLSI systems [1], [2]. Moreover, tech- worst-case biasing conditions and PVT variations [6]–[9]. As nology scaling has led to a continuous increase in paramet- a result, the retention power of these memories becomes dom- ric variations, resulting in reduced SRAM noise margins, inated by the refresh power component, and the availability which ultimately limit the voltage scaling capabilities of these of the memory to processor access is lowered due to frequent memories [3], [4]. refresh cycles. Logic-compatible gain-cell embedded DRAM The DRT of GC-eDRAM arrays highly depends on the (GC-eDRAM) is an attractive alternative to conventional sub-threshold (sub-VT) leakage of MOS transistors, which SRAM, as it offers higher density due to a smaller bitcell is exponentially impacted by threshold voltage (VT) fluctu- size, lower leakage, and two-ported operation [5]. However, ations due to process variations and device mismatch. As the dynamic nature of gain-cells requires the application of a result, the DRT distribution of GC-eDRAM arrays has a long tail, with the average DRT found to be a few orders-of- The associate editor coordinating the review of this manuscript and magnitude higher than the minimum DRT of the array [10]. approving it for publication was Tae Hyoung Kim. This phenomenon has gotten worse with technology scaling 2169-3536 2019 IEEE. Translations and content mining are permitted for academic research only. VOLUME 7, 2019 Personal use is also permitted, but republication/redistribution requires IEEE permission. 27641 See http://www.ieee.org/publications_standards/publications/rights/index.html for more information. R. Giterman et al.: Improving Energy-Efficiency in Dynamic Memories Through Retention Failure Detection due to increased parametric variations [11]. Furthermore, B. OUTLINE the DRT is measured under worst-case operating sequences, The rest of this paper is organized as follows: SectionII which are highly unlikely, if not infeasible [12]. Therefore, demonstrates the cost of refresh in terms of the power and substantial energy savings can be achieved by refreshing availability of GC-eDRAM arrays; Section III introduces when needed and not at a predefined rate, set according to concept of DRT extension using error detection; SectionIV a worst-case DRT based on the first failing bitcell, under the evaluates the error detection realization using ECC; SectionV worst possible operating conditions and sequences. demonstrates how refresh-rate relaxation can be achieved Recent work has proposed trading off data accuracy using CDMR GC-eDRAM with inherent per-bit error detec- (i.e., bitcell failure rate) with power consumption in cases, tion; SectionVI presents the implementation of the proposed where the underlying application has a certain degree of memory and compares it with other memory solutions with error tolerance [11], [13]. This concept can be extended to integrated ECC; and Section VIII concludes the paper. non-error tolerant memories that have a back-up copy of their data, such as low-level caches that employ write-through II. COST OF HIGH REFRESH-RATE policy [14]–[17]. Based on the fact that the application is Total power consumption is generally measured as a combi- either error tolerant or that a backup copy exists, we propose nation of dynamic and static power components. In the case of to postpone the refresh operation until one or more errors static memories, such as SRAM, the dynamic power compo- are detected. This provides a means to significantly increase nent (Pdyn) is consumed during read and write accesses, while the energy-efficiency and availability of GC-eDRAM arrays, the static power (Pstat) is due to leakage during standby (reten- while bounding the error for error-tolerant applications [18], tion) periods. However, when discussing dynamic memories, or signaling that the data is corrupted and should be recov- such as GC-eDRAM, data retention requires the application ered. An example of such an operation could be signaling of periodic refresh operations. Therefore, the static power of a miss in a write-through cache, which would initiate data dynamic memories is redefined as retention power (Pret) that recovery from a higher level cache. is composed of both leakage and refresh power components. As an efficient implementation of such an approach, A refresh operation includes reading out each row and writing we propose using a 4-transistor complementary dual-modular it back once every refresh period, such that the total power can redundancy (CDMR) gain-cell with inherent per-bit error be written as: detection capabilities, previously proposed for space applica- P D P C P D P C f (E C E ); (1) tions due to its enhanced soft-error tolerance [19]. The inher- ret leak refresh leak refresh write read ent error-detection scheme does not incur a latency overhead where Pleak is the average leakage power of the array, Prefresh or additional hardware, as required by conventional error cor- is the refresh power, Ewrite and Eread are the write and read rection code (ECC) schemes, and therefore, it is advantageous energies, respectively, and frefresh is the refresh-rate of the for the implementation of memories that require high speed memory. Consequently, lowering the refresh-rate leads to operation, such as low-level caches. The proposed approach extensive power savings, especially due to the fact that the achieves over 60% energy savings by allowing up-to 5 pre- leakage power component of (1) is often substantially lower sumed DRT errors, with a 15% smaller cell size than a con- than the refresh power component [11]. ventional 6T SRAM cell. The energy savings can be further A different aspect to the cost of refresh-rate is the availabil- increased by tolerating a higher number of DRT errors. ity of the array to processor operations, since during refresh operations, the memory cannot be accessed. Defining array A. CONTRIBUTIONS availability (Av) as the percentage of time that the memory is The main contributions of this paper are summarized as available for external access leads to: follows: Av(%) D (1 − N (t C t )f ) × 100; (2) 1) This work presents a methodology for refresh-rate rows write read refresh reduction through error detection, based on the large where Nrows is the number of rows, and twrite and tread are the spread of DRT variation in deeply-scaled process write and read access times, respectively. Again, this leads nodes. to the conclusion that refresh-rate is expensive; this time, 2) We analyze error detection realization using ECC, and in terms of array availability. provide a qualitative analysis of the power, area, and To understand the need for the application of refresh, Fig.1 latency overheads of this approach. shows a conventional two-transistor (2T) gain-cell, which 3) We propose using a CDMR technique, implemented stores its data as charge on a parasitic capacitor (CSN). The using a 4-transistor GC-eDRAM bitcell with per-bit stored charge leaks away over time, primarily through the error detection, to relax the refresh-rate.