DESTINY: a Tool for Modeling Emerging 3D NVM and Edram Caches

DESTINY: A Tool for Modeling Emerging 3D NVM and eDRAM caches Matt Poremba¶, Sparsh Mittal, Dong Li, Jeffrey S. Vetter§, Yuan Xie ¶Pennsylvania State University Oak Ridge National Laboratory §Georgia Institute of Technology University of California at Santa Barbara Email: [email protected],{mittals,lid1,vetter}@ornl.gov,[email protected] Abstract—The continuous drive for performance has pushed design approaches. The existing modeling tools model only a the researchers to explore novel memory technologies (e.g. non- subset of memory technologies, for example CACTI [4] and volatile memory) and novel fabrication approaches (e.g. 3D stack- its extension can model SRAM, eDRAM and DRAM but not ing) in the design of caches. However, a comprehensive tool which NVMs, and NVSim [5] models only 2D designs of SRAM models both conventional and emerging memory technologies and NVMs but not eDRAM. As an increasing number of for both 2D and 3D designs has been lacking. We present commercial designs utilize 3D stacking [6, 7], architecture and DESTINY, a microarchitecture-level tool for modeling 3D (and 2D) cache designs using SRAM, embedded DRAM (eDRAM), spin system level research on 3D stacking has become even more transfer torque RAM (STT-RAM), resistive RAM (ReRAM) and important. A few 3D modeling tools exist such as CACTI- phase change RAM (PCM). DESTINY facilitates design-space 3DD [8] and 3DCacti [9], however, they do not model NVMs. exploration across several dimensions, such as optimizing for Also, 3DCACTI has not been updated to support technology a target (e.g. latency or area) for a given memory technology, nodes below 45nm. Since different tools use different mod- choosing the suitable memory technology or fabrication method eling frameworks and assumptions, comparing the estimates (i.e. 2D v/s 3D) for a desired optimization target etc. DESTINY obtained from different tools may be incorrect. Thus, a single, has been validated against industrial cache prototypes. We believe validated tool which can model both 2D and 3D designs using that DESTINY will drive architecture and system-level studies prominent memory technologies is lacking. Due to this, several and will be useful for researchers and designers. architecture-level studies on 3D caches derive their parameters Keywords—Cache, SRAM, eDRAM, STT-RAM, ReRAM, PCM, using a linear extrapolation of 2D parameters which may be non-volatile memory (NVM or NVRAM), modeling tool, validation. inaccurate or sub-optimal. In this paper, we present DESTINY1,a3Ddesign-space I. INTRODUCTION exploration tool for SRAM, eDRAM and non-volatile memory. DESTINY utilizes the 2D circuit-level modeling framework of Recent trends of increasing system core-count and mem- NVSim for SRAM and NVMs. Also, it utilizes the coarse- and ory bandwidth bottleneck have necessitated use of large size fine-grained TSV (through silicon via) models from CACTI- on-chip caches. For example, Intel’s Ivytown processor has 3DD. Further, DESTINY adds the model of eDRAM (Section 37.5MB SRAM LLC [1]. To overcome the limitations of II-A) and two additional types of 3D designs (Section II-B). SRAM, viz. high leakage power and low density, researchers Overall, DESTINY enables modeling of both 2D and 3D and designers have explored alternate memory technologies, designs of five memory technologies (SRAM, eDRAM and such as eDRAM, STT-RAM, ReRAM and PCM. These tech- three NVMs). Also, it is able to model technology nodes nologies enable design of large size caches, for example, ranging from 22nm to 180nm. Clearly, DESTINY provides IBM’s 22nm POWER8 processor uses 96MB L3 eDRAM comprehensive modeling and design space exploration capa- cache [2]. In parallel, research has also been directed to novel bility which is not provided by any of the existing tools. fabrication techniques such as 3D integration that enables vertical stacking of multiple layers [3]. 3D stacking provides We have compared the results from DESTINY against several benefits such as high density and higher flexibility in several commercial prototypes [6, 7, 10–14] to validate 2D routing signals, power and clock. design of eDRAM and 3D designs of SRAM, eDRAM and ReRAM in DESTINY (Section III). We observe that the Lack of comprehensive and validated modeling tools, how- modeling error is less than 10% for most cases and less than ever, hinders full study of emerging memory technologies and 20% for all cases. This can be considered reasonable for an academic modeling tool and is also in range with the errors M. Poremba and S. Mittal are co-first authors. Support for this work was provided by U.S. Department of Energy, Office produced by previous tools [5]. of Science, ASCR. This manuscript has been authored by UT-Battelle, LLC, under Contract No. DE-AC0500OR22725 with the U.S. Department of Energy. DESTINY provides the capability to explore a large design M. Poremba and Y. Xie are supported in part by NSF 1218867, 1213052, space which provides important insights and is also useful for 1409798 and DoE DE-SC0005026. The United States Government retains and early stage estimation of emerging memory technologies. For the publisher, by accepting the article for publication, acknowledges that the example, while it may be straightforward to deduce the optimal United States Government retains a non-exclusive, paid-up, irrevocable, world- wide license to publish or reproduce the published form of this manuscript, or memory technology for some parameters (e.g. the technology allow others to do so, for the United States Government purposes. This work was performed when M. Poremba was working as a graduate student intern 1The source-code of DESTINY can be downloaded from the following git at ORNL. repository: https://code.ornl.gov/3d cache modeling tool/destiny.git 978-3-9815370-4-8/DATE15/c 2015 EDAA 1543 with smallest cell size is likely to have lowest area), this whereby all subarrays are refreshed in parallel, row-by-row. may not be easy for other parameters such as read/write EDP This approach provides the benefit that the refresh operations (energy delay product), since they depend on multiple factors. do not significantly reduce the availability of banks to service Clearly, use of a tool such as DESTINY is imperative for full requests. DESTINY can also be extended to model other design space exploration and optimization refreshing schemes such as refreshing the mats in parallel. From performance and feasibility perspective, eDRAM cache II. MODELING FRAMEWORK designs which provide bank availability (i.e., the percentage of time where the bank is not refreshing) below a threshold DESTINY is designed to be a comprehensive tool able to are not desired and hence, they are discarded by DESTINY. model multiple memory technologies. Figure 1 shows a high- Since retention period of eDRAM varies exponentially with level diagram of DESTINY. DESTINY framework utilizes the the temperature [18], DESTINY scales the retention period 2D circuit-level model of NVSim, which was extended to accordingly to model the effect of the temperature. DESTINY model 2D eDRAM and 3D design of SRAM, eDRAM and computes the refresh latency, energy, and power which are monolithic NVMs. For a given memory technology, the device- provided as the output of the tool. level parameters (e.g. cell size, set-voltage, reset voltage) are fed to DESTINY. Then, possible configurations are generated which are passed to the circuit-level modeling code. For 3D B. 3D Model designs, 3D modeling is also done and the configurations Several flavors of 3D stacking have been explored in the include different number of 3D layers (e.g. 1, 2, 4, 8, 16 etc.). literature, such as face-to-face, face-to-back, and monolithic The designs which are physically infeasible are considered as stacking [3]. In all these approaches except monolithic stack- invalid and are discarded, for example, if the refresh period ing, dies are bonded using various techniques (e.g., wafer-to- of an eDRAM design is greater than its retention period, it wafer, die-to-wafer, or die-to-die bonding). These approaches is considered invalid. This reduces the number of possible differ in terms of their effect on die testing and yield. Wafer-to- options. The remaining configurations are passed through an wafer can potentially reduce yield by bonding a dysfunctional optimization filter which selects the optimal configuration die anywhere in the stack. Die-to-wafer and die-to-die can based on a target such as least read latency or least area etc. reduce this by testing individual dies, although it has adverse effect on alignment. Input DESTINY Framework Latency Cache Size, 3D, The most common 3D stacking is known as face-to-back Optimizations Configuration Valid Optimiz- Energy … bonding. In this form, through silicon vias (TSVs) are used to Generator Results ation penetrate the bulk silicon and connect the first metal layer (the SRAM Filter Leakage ReRAM back) to the top metal layer (the face) of a second die. In face- Circuit-level Area to-face bonding, the top metal layer of one die is directly fused STTRAM 3D Model 2D Modeling PCM to the top metal layer of a second die. Monolithic stacking does Refresh eDRAM not use TSVs at all. Instead, monolithically stacked dies build devices on higher metal layers connected using normal metal Fig. 1: High-level overview of DESTINY framework. layer vias if necessary. DESTINY also allows design space exploration across Each approach has its own advantages and disadvantages. multiple memory technologies, for example, finding an optimal Face-to-back must carefully consider placement and avoid technology for a given metric/target. For such use cases, transistors when being formed through the bulk silicon while device-level parameters for multiple memory technologies can face-to-face does not. Therefore, face-to-face has the potential be fed as input to DESTINY (shown in left of Figure 1).

Load more