The Pennsylvania State University The Graduate School College of Engineering

MODELING AND LEVERAGING EMERGING NON-VOLATILE MEMORIES FOR FUTURE COMPUTER DESIGNS

A Dissertation in Computer Science and Engineering by Xiangyu Dong

© 2011 Xiangyu Dong

Submitted in Partial Fulfillment of the Requirements for the Degree of

Doctor of Philosophy

December 2011

The dissertation of Xiangyu Dong was reviewed and approved* by the following:

Yuan Xie Associate Professor of Computer Science and Engineering Dissertation Advisor, Chair of Committee

Mary Jane Irwin Professor of Computer Science and Engineering

Vijaykrishnan Narayanan Professor of Computer Science and Engineering

Suman Datta Professor of Electrical Engineering

Norman P. Jouppi Director of Intelligent Infrastructure Lab at Hewlett-Packard Labs Special Member

Mahmut Taylan Kandemir Professor of Computer Science and Engineering Director of Graduate Affairs

*Signatures are on file in the Graduate School.

Abstract

Energy efficiency has become a major constraint in the design of computing systems today. As CMOS continues scaling down, traditional CMOS scaling theory requires reducing the supply and threshold voltages in proportion to the device sizes, which exponentially increases the leakage. As a result, leakage power has become comparable to dynamic power in current-generation processes. Before leakage power becomes the dominant part of the power budget, disruptive emerging technologies are needed. Fortunately, many new types of non-volatile memory technologies are now evolving. For example, emerging non-volatile memories such as Spin-Torque-Transfer RAM (MRAM, STTRAM), Phase-Change RAM (PCRAM), and Resistive RAM (ReRAM) show attractive properties of high access performance, low access energy, and high cell density. Therefore, it is promising to employ these emerging non-volatile memory technologies in designing future high-performance and low-power computing systems. However, since none of these new non-volatile memories is mature yet, academic research is necessary to demonstrate the usefulness of these technologies. To that end, this dissertation investigates three aspects of deploying these emerging non-volatile memory technologies. First, a circuit-level performance, energy, and area model for various non-volatile memories is built. Second, several architecture-level techniques that mitigate the drawbacks of non-volatile memory write operations are proposed and evaluated. Third, application-level case studies of adopting these emerging technologies are conducted.

Contents

List of Figures

List of Tables

List of Symbols

Acknowledgments

Dedication

1 Introduction

2 Technology Background
   2.1 NAND Flash Switching Mechanism
   2.2 STTRAM Switching Mechanism
   2.3 PCRAM Switching Mechanism
   2.4 ReRAM Switching Mechanism
   2.5 eNVM Read Operations
   2.6 eNVM Problems
       2.6.1 Write Latency/Energy Issue
       2.6.2 Write Endurance Issue

3 Related Work
   3.1 Previous Work on Circuit Level
   3.2 Previous Work on Architecture Level
   3.3 Previous Work on Application Level

4 Circuit-Level: eNVM Modeling
   4.1 NVSim Project Overview
   4.2 NVSim Framework
       4.2.1 Device Model
       4.2.2 Array Organization
       4.2.3 Memory Bank Type
       4.2.4 Activation Mode
       4.2.5 Routing to Mats
       4.2.6 Routing to Subarrays
   4.3 Area Model
       4.3.1 Cell Area Estimation
       4.3.2 Peripheral Circuitry Area Estimation
   4.4 Timing and Power Models
       4.4.1 Generic Timing and Power Estimation
       4.4.2 Data Sensing Models
       4.4.3 Cell Switching Model
   4.5 Miscellaneous Circuitry
       4.5.1 Pulse Shaper
       4.5.2 Charge Pump
   4.6 Validation Result
       4.6.1 NAND Flash Validation
       4.6.2 STT-RAM Validation
       4.6.3 PCRAM Validation
       4.6.4 ReRAM Validation
   4.7 Summary

5 Architecture-Level: Techniques for Alleviating eNVM Write Overhead
   5.1 Directly Replacing SRAM Caches
   5.2 Read-Preemptive Write Buffer
   5.3 Hybrid SRAM-eNVM Cache
   5.4 Effectiveness of Read-Preemptive Write Buffer and Hybrid Cache
   5.5 Summary

6 Application-Level: eNVM for File Storage
   6.1 Multi-Level Cell
       6.1.1 Extra Write Overhead
       6.1.2 Extra Read Overhead
       6.1.3 Reduced Cell Lifetime
       6.1.4 PCRAM Lifetime Model
   6.2 Adaptive MLC/SLC PCRAM Array Structure
       6.2.1 MLC/SLC Write: SET, RESET, and PGM Pulses
       6.2.2 MLC/SLC Read: Dual-Mode Sense Amplifier
       6.2.3 Address Re-mapping
       6.2.4 Reconfigurable PCRAM-based Solid-State Disk
   6.3 Experimental Results
       6.3.1 PCRAM MLC/SLC Timing Model
       6.3.2 Performance-Aware Management Result
       6.3.3 Performance-Cost Analysis
       6.3.4 Lifetime Analysis
   6.4 Summary

7 Application-Level: eNVM for Exascale Fault Tolerance
   7.1 Problem
   7.2 Integrating eNVM Modules into MPP Systems
   7.3 Local/Global Hybrid Checkpoint
       7.3.1 Hybrid Checkpoint Scheme
       7.3.2 System Failure Category Analysis
       7.3.3 Theoretical Performance Model
   7.4 Experimental Results
       7.4.1 Checkpointing Scenarios
       7.4.2 Scaling Methodology
       7.4.3 Performance Analysis
       7.4.4 Power Analysis
   7.5 Summary

8 Application-Level: eNVM as On-Chip Cache
   8.1 Overview
   8.2 ReRAM-Based Cache Wear-Leveling
       8.2.1 Inter-Set Cache Line Wear-Leveling
       8.2.2 Intra-Set Cache Line Wear-Leveling
       8.2.3 Endurance Requirements for ReRAM Caches
   8.3 Circuit-Level ReRAM Model
       8.3.1 ReRAM Modeling
       8.3.2 ReRAM Array Design Spectrum
   8.4 Architecture-Level Model of Design
       8.4.1 Feed-Forward Network
       8.4.2 Training and Validation
   8.5 Experimental Methodology
       8.5.1 Circuit-Architecture Joint Exploration Framework
       8.5.2 Simulation Environment
   8.6 Design Exploration and Optimization
       8.6.1 Cache Hierarchy Design Exploration
       8.6.2 Design Optimization
       8.6.3 Discussion
   8.7 Summary

9 Conclusion

Bibliography

List of Figures

1.1 While the state-of-the-art memory hierarchy includes SRAM, DRAM, NAND flash, and HDD, we propose an eNVM-based new hierarchy that only uses non-volatile memory technology and provides both high-performance and low-power operation.
2.1 The basic string block of NAND flash, and the conceptual view of floating-gate flash (BL=bitline, WL=wordline, SG=select gate).
2.2 The conceptual view of an STTRAM cell.
2.3 The schematic view of a PCRAM cell with a MOSFET selector transistor (BL=bitline, WL=wordline, SL=sourceline).
2.4 The temperature-time relationship during SET and RESET operations.
2.5 The working mechanism of ReRAM cells.
4.1 An example of the memory array organization modeled in NVSim: a hierarchical memory organization includes banks, mats, and subarrays with decoders, multiplexers, sense amplifiers, and output drivers.
4.2 The example of the wire routing in a 4x4 mat organization for the data array of an 8-way 1MB cache with 64B cache lines.
4.3 The example of the wire routing in a 4x4 mat organization for the tag array of an 8-way 1MB cache with 64B cache lines.
4.4 An example of a mat using internal sensing and H-tree routing.
4.5 An example of a mat using external sensing and bus-like routing.
4.6 Conceptual view of a MOS-accessed cell (1T1R) and its connected word line, bit line, and source line.

4.7 Conceptual view of a cross-point cell array without diode (0T1R) and its connected word lines and bit lines.
4.8 The layout of the NAND-string cell modeled in NVSim.
4.9 Transistor sizings: (a) latency-optimized; (b) balanced; (c) area-optimized.
4.10 Analysis model for the current sensing scheme.
4.11 Analysis model for the current-in voltage sensing scheme.
4.12 Analysis model for the voltage-divider sensing scheme.
4.13 The current-voltage converter modeled in NVSim.
4.14 The circuit schematic of the slow quench pulse shaper used in [1].
5.1 In a read-preemptive write buffer, read operations can be granted over the ongoing write operation if the progress of that write operation is less than 50%.
5.2 The performance impact of the preemption condition.
5.3 STTRAM write intensity with and without hybrid SRAM-STTRAM caches.
5.4 The performance improvement and energy reduction after applying the read-preemptive write buffer and hybrid cache techniques.
6.1 The basic MLC PCRAM programming scheme.
6.2 SET and RESET resistances during PCRAM cycling, illustrating the difference between failure by "stuck-SET" and by "stuck-RESET".
6.3 The block diagram of the PCRAM array organization that supports both MLC and SLC operations.
6.4 The conceptual view of managing SLC and MLC modes.
6.5 The performance of the adaptive MLC/SLC solution under different utilizations.
6.6 The performance per cost analysis of the adaptive MLC/SLC solution.
7.1 The typical organization of a contemporary supercomputer. All the permanent storage devices are controlled by I/O nodes. There is no local permanent storage for each node.

7.2 The proposed new organization that supports hybrid checkpoints. The primary permanent storage devices are still connected through I/O nodes, but each process node also has permanent storage.
7.3 The bandwidth with different write sizes.
7.4 The main memory bandwidth with different write sizes.
7.5 The organization of a conventional 18-chip DRAM DIMM with ECC support.
7.6 The organization of the proposed 18-chip PCRAM DIMM with ECC support.
7.7 The local/global hybrid checkpoint model.
7.8 A conceptual view of execution time broken by the checkpoint interval: (a) an application running without failure; (b) an application running with a failure, where the system rewinds back to the most recent checkpoint and is recovered by the local checkpoint; (c) an application running with a failure that cannot be protected by the local checkpoint, so the system rewinds back to the most recent global checkpoint.
7.9 The checkpoint overhead comparison in a 1-petaFLOPS system (normalized to the computation time).
7.10 The checkpoint overhead comparison in a 1-exaFLOPS system (normalized to the computation time).
8.1 Inter-set L3 cache line write count variation in a simulated 8-core system with 32KB I-L1, 32KB D-L1, 1MB L2, and 8MB L3 caches.
8.2 Intra-set L3 cache line write count variation in a simulated 8-core system with 32KB I-L1, 32KB D-L1, 1MB L2, and 8MB L3 caches.
8.3 The comparison of the D-L1 intra-set cache write variations using the plain LRU policy and the proposed endurance-aware LRU policy with proactive invalidation: (a) the log-scale write counts of different applications; (b) the linear-scale write counts of selected applications that originally have large intra-set cache write variations.
8.4 The design spectrum of 32nm ReRAM: (upper) read latency vs. density; (bottom) write latency vs. density.

8.5 The basic organization of a two-layer feed-forward artificial neural network.
8.6 An accurate ANN fitting example: MG from NPB.
8.7 A typical ANN fitting example: dedup from PARSEC.
8.8 The worst ANN fitting example: x264 from PARSEC.
8.9 Overview of the optimization framework.
8.10 CDF plots of error on IPC prediction of NPB and PARSEC benchmark applications.
8.11 Pareto-optimal curves: energy and performance trade-off of the memory hierarchy. Main memory dynamic power is included for a fair comparison.
8.12 Pareto-optimal curves (cross-point ReRAM as main memory): energy and performance trade-off under different constraints on SRAM deployment.
8.13 Pareto-optimal curves (cross-point ReRAM as main memory): cache area and performance trade-off under different constraints on SRAM deployment.
8.14 The global Pareto-optimal curve (cross-point ReRAM as main memory) and feasible design options with total cache area less than 3mm^2.
8.15 The path of EDP optimization.
8.16 The path of EDAP optimization.
8.17 A proposed universal memory hierarchy using ReRAM.

List of Tables

1.1 Characteristics of memory and storage technologies [2]
1.2 Characteristics of emerging non-volatile memory (eNVM) (collected from the literature)
4.1 The initial number of wires in each routing group
4.2 NVSim's NAND flash model validation with respect to a 50nm 2Gb NAND flash chip (B-SLC2) [3]
4.3 NVSim's STT-RAM model validation with respect to a 65nm 64Mb STT-RAM prototype chip [4]
4.4 NVSim's PCRAM model validation with respect to a 0.12µm 64Mb MOS-accessed PCRAM prototype chip [5]
4.5 NVSim's PCRAM model validation with respect to a 90nm 512Mb diode-selected PCRAM prototype chip [1]
4.6 NVSim's ReRAM model validation with respect to a 0.18µm 4Mb MOSFET-selected ReRAM prototype chip [6]
5.1 Area, access time, and energy comparison between SRAM and STTRAM caches within similar silicon area (65nm technology)
5.2 Baseline configuration parameters
5.3 L2 transaction intensities
5.4 The performance and power improvement (STTRAM cache vs. SRAM cache)
6.1 The lifetime model of PCRAM cells
6.2 The timing model of PCRAM cells in SLC and MLC modes
7.1 Different Configurations of the PCRAM Chips

7.2 The Statistics of the Failure Root Cause Collected by LANL during 1996-2005
7.3 Local/Global Hybrid Checkpointing Parameters
7.4 Bottleneck Factor of Different Checkpoint Schemes
7.5 Specifications of the Baseline Petascale System and the Projected Exascale System
8.1 Input design space parameters
8.2 MOS-accessed and cross-point ReRAM main memory parameters (1Gb, 8-bit, 16-bank)
8.3 On-die cache hierarchy design parameters of 7 design options
8.4 On-die cache hierarchy design parameters of 7 design options (continued)
8.5 Overview of the proposed universal memory hierarchy

List of Symbols

eNVM     Emerging non-volatile memory
STTRAM   Spin-torque transfer random access memory
PCRAM    Phase-change random access memory
ReRAM    Resistive random access memory
NVSim    Non-volatile memory simulator

Acknowledgments

Many people have contributed to my academic progress. First of all, I thank Prof. Yuan Xie for his invaluable guidance and support as my Ph.D. advisor. I appreciate his energetic working style and his way of guiding students.

Also at Pennsylvania State University, I thank Prof. Mary Jane Irwin, Prof. Vijaykrishnan Narayanan, and Prof. Suman Datta for serving on my dissertation committee. Their feedback at the early stages gave me a strong foundation for continuing my research topic.

For enhancing my research skills with practical experience, I sincerely thank Dr. Norman P. Jouppi for hosting my internship at Hewlett-Packard Labs. His legendary research experience and great personality made my time at Hewlett-Packard Labs one of the most cherished periods of my life.

For enriching my experience of studying abroad, I thank my friends and colleagues at Pennsylvania State University and Hewlett-Packard Labs. I especially thank Xiaoxia Wu, Yang Ding, Guangyu Sun, Yibo Chen, Jin Ouyang, Dimin Niu, Tao Zhang, Cong Xu, Jishen Zhao, Jing Xie, Matt Poremba, Jingchen Liu, Rob Schreiber, Partha Ranganathan, Jichuan Chang, and Sheng Li.

Last but not least, I thank Jue Wang for making my life much more vivid than I could ever imagine. Most importantly, I thank my parents Zhiping Liu and Junkang Dong for their love. I hope I will always make them proud.

Dedication

This dissertation is dedicated to my parents Zhiping Liu and Junkang Dong, for their support, encouragement, education, and love throughout my life.

Chapter 1

Introduction

One of the major constraints in the design of computing systems today is energy efficiency. The state-of-the-art CMOS process suffers from an increasing leakage problem as the process node keeps scaling down. In order to build next-generation exascale computing systems with high energy efficiency, disruptive technologies are required that provide both high-performance and low-power computing capability. Considering that memory access latency and disk access latency are several orders of magnitude slower than the processing cores, and that system memory power (DRAM power) and disk power contribute as much as 40% of the overall power consumption in a data center [7-9], it is highly necessary to first improve the performance and power profiles of the traditional memory hierarchy.

Traditionally, SRAM on-chip caches, DRAM off-chip main memory, and hard disk drive (HDD) storage are the three major components of today's computer memory hierarchy. With the improvements in speed, density, and cost of NAND flash devices, the solid-state drive (SSD) has gained momentum as an HDD replacement or as a storage cache located between the DRAM main memory and HDD storage. Hence, the state-of-the-art memory hierarchy has four levels, as shown in Figure 1.1.

However, the state-of-the-art memory hierarchy shown in Figure 1.1 faces several challenges. On the performance side, HDD speed has become a severe bottleneck, and its mechanical structure limits the upper bound of its access speed.


Figure 1.1: While the state-of-the-art memory hierarchy includes SRAM, DRAM, NAND flash, and HDD, we propose an eNVM-based new hierarchy that only uses non-volatile memory technology and provides both high-performance and low-power operation.

Although the recent adoption of SSD storage raises the performance level, the slow programming speed of NAND flash devices and their write endurance of only 10^5 cycles still prevent SSD storage from becoming the mainstream storage of the future. On the energy consumption side, the existing DRAM main memory already contributes a large portion of the total system energy consumption, and the increasing leakage power makes SRAM on-chip caches and DRAM off-chip main memories impractical to scale down to the next fabrication process level. Therefore, disruptive technologies are strongly required to improve the memory hierarchy's performance without a proportional increase in energy consumption.

Table 1.1: Characteristics of memory and storage technologies [2]

Device type       SRAM      DRAM      NAND flash    HDD
Maturity          Product   Product   Product       Product
Cell size         150F^2    6F^2      4F^2          (2/3)F^2
MLC capability    No        No        4 bits/cell   No
Write energy      1pJ       2pJ       10nJ          N/A
Write latency     5ns       10ns      200µs         10ms
Write endurance   10^16     10^16     10^5          N/A

Recently, significant effort and resources have been devoted to the research and development of emerging memory technologies. Several promising candidates, such as Spin-Torque Transfer RAM (MRAM or STTRAM) [4,10-13], Phase-Change RAM (PCRAM) [1,5,14-24], and Resistive RAM (ReRAM) [6,25-35], have gained substantial attention and are being actively pursued by both academic and industrial research. In this dissertation, we call them eNVM (emerging non-volatile memory). eNVMs have attractive features such as

Table 1.2: Characteristics of emerging non-volatile memory (eNVM) (collected from the literature)

Device type       STTRAM          PCRAM           ReRAM
Maturity          Prototype       Prototype       Prototype
Cell size         13F^2-100F^2    4F^2-40F^2      4F^2-70F^2
MLC capability    No              4 bits/cell     4 bits/cell
Write energy      1pJ-100pJ       4pJ-200pJ       0.3pJ-25pJ
Write latency     4ns-100ns       100ns-1000ns    0.3ns-100ns
Write endurance   10^12-10^16     10^5-10^8       10^5-10^11

scalability, fast read access, zero standby power, and non-volatility. In addition, by using different types of peripheral circuitry, eNVMs can cover a wide design space, from highly performance-optimized on-chip caches to highly area-optimized secondary storage. Table 1.1 and Table 1.2 compare the characteristics of conventional memory/storage technologies (i.e., SRAM, DRAM, NAND flash, and HDD) and eNVMs. The comparison shows that eNVMs generally outperform their non-volatile counterparts (e.g., NAND flash and HDD) in all aspects. In addition, eNVMs have a cell density as high as that of DRAM and read performance similar to that of SRAM, but their drawbacks are relatively long write latency and high write energy (for PCRAM and ReRAM, the limited write endurance is another issue). Nevertheless, it is promising to redesign the memory hierarchy by adopting various styles of eNVM at different hierarchy levels, as demonstrated in Figure 1.1. Compared to the state-of-the-art designs,

• The non-volatile nature of eNVMs reduces the memory subsystem energy consumption that was originally contributed by SRAM caches and DRAM main memory;

• The nanosecond-level access speed of eNVMs easily outperforms today’s SSD and HDD in the storage level of the hierarchy.

Since eNVM technologies are still immature and have drawbacks in handling write operations, the challenges for eNVM research include: (1) at the circuit level, estimation tools are required so that high-level designs can understand and exploit the underlying eNVM devices; (2) at the architecture level, non-standard techniques are necessary to mitigate the long write latency, high write energy, and limited write endurance of eNVMs; (3) at the application level, eNVMs need to demonstrate their effectiveness in enabling the design of future high-performance and energy-efficient computing systems.

This dissertation is one of the efforts to address these challenges. First, a circuit-level performance, energy, and area model for various non-volatile memories is built and described in Chapter 4. Second, several architecture-level techniques that mitigate the drawbacks of non-volatile memory write operations are proposed and evaluated in Chapter 5. Third, application-level case studies of adopting these eNVM technologies are conducted. The case studies include eNVM-based file storage in Chapter 6, an eNVM-based checkpoint-restart scheme in Chapter 7, and eNVM-based memory hierarchy exploration in Chapter 8.

Chapter 2

Technology Background

Different eNVM technologies have their particular storage mechanisms and corresponding write methods. In this chapter, we first review the technology background of several emerging eNVM technologies, including Spin-Torque-Transfer RAM (MRAM, STT-RAM), Phase-Change RAM (PCRAM), and Resistive RAM (ReRAM). In addition, the conventional NAND flash technology is also covered.

2.1 NAND Flash Switching Mechanism

The physical mechanism of flash memory is to store bits in a floating gate and thereby control the gate threshold voltage. The series bit-cell string of NAND flash, as shown in Figure 2.1(a), eliminates contacts between the cells and approaches the minimum cell size of 4F^2 for low-cost manufacturing. The small cell size, low cost, and strong application demand make NAND flash dominate the traditional non-volatile memory market. Figure 2.1(b) shows that a flash memory cell consists of a floating gate and a control gate aligned vertically. The flash memory cell modifies its threshold voltage VT by adding electrons to or subtracting electrons from the isolated floating gate.

NAND flash charges or discharges the floating gate by using Fowler-Nordheim (FN) tunneling or hot carrier injection (HCI). A program operation adds tunneling charge to the floating gate so that the threshold voltage becomes positive, while an erase operation


Figure 2.1: The basic string block of NAND flash, and the conceptual view of a floating-gate flash memory cell (BL=bitline, WL=wordline, SG=select gate).

subtracts charge so that the threshold voltage returns to a negative value. While NAND flash currently has the largest market demand, it is not an ideal non-volatile memory technology. First, NAND flash memory can only be erased one "block" at a time, which complicates the programming scheme. Second, NAND flash is known to have a severe write endurance problem: a single flash memory cell survives only a limited number of program-erase cycles. Therefore, in practice, a flash translation layer (FTL) is always required to simplify the access scheme and maintain effective wear-leveling. In addition, NAND flash memories face significant scaling challenges due to their dependence on reductions in lithographic resolution as well as fundamental physical limitations beyond the 22nm process node, such as severe floating-gate interference, lower coupling ratio, short-channel effects, and the low electron charge in the floating gate [2].

2.2 STTRAM Switching Mechanism

STTRAM is a new type of MRAM that has drawn considerable attention in the past five years for its below-100nm scalability and its lower write current compared with conventional magnetic-field-induced switching MRAM.


Figure 2.2: The conceptual view of an STTRAM cell.

Similar to its MRAM counterpart, STTRAM uses a Magnetic Tunnel Junction (MTJ) as the memory storage element and leverages the difference in magnetic directions to represent a memory bit. As shown in Figure 2.2, an MTJ contains two ferromagnetic layers. One ferromagnetic layer has a fixed magnetization direction and is called the reference layer, while the other layer has a free magnetization direction that can be changed by passing a write current and is called the free layer. The relative magnetization directions of the two ferromagnetic layers determine the resistance of the MTJ. If the two layers have the same direction, the resistance of the MTJ is low, indicating a "1" state; if the two layers have different directions, the resistance of the MTJ is high, indicating a "0" state. Unlike conventional MRAM, in which the polarization of the free layer in the MTJ is set by an external magnetic field generated by the combined write current in the write wordline and the bitline, STTRAM directly uses the current to SET (or RESET) the polarization of the free ferromagnetic layer. As shown in Figure 2.2, when writing the "0" state into an STTRAM cell (RESET operation), a positive voltage difference is established between SL and BL; when writing the "1" state (SET operation), vice versa. The current amplitude required to reverse the direction of the free ferromagnetic layer is determined by the size and aspect ratio of the MTJ and the write pulse duration.

2.3 PCRAM Switching Mechanism


Figure 2.3: The schematic view of a PCRAM cell with a MOSFET selector transistor (BL=bitline, WL=wordline, SL=sourceline).

Figure 2.4: The temperature-time relationship during SET and RESET operations: an amorphizing RESET pulse heats GST above the melting temperature (~600°C), while a crystallizing SET pulse keeps it above the crystallization transition temperature (~300°C).

PCRAM uses a chalcogenide material (e.g., GST) to store information. The chalcogenide material can be switched between a crystalline phase (SET state) and an amorphous phase (RESET state) with the application of heat. The crystalline phase shows low resistivity, while the amorphous phase is characterized by high resistivity. Figure 2.3 shows an example of a MOS-accessed PCRAM cell.

The SET operation crystallizes GST by heating it above its crystallization temperature, and the RESET operation melt-quenches GST to make the material amorphous, as illustrated in Figure 2.4. The temperature is controlled by passing a specific electrical current profile and generating the required Joule heat. High-power pulses are applied during the RESET operation to heat the memory cell above the GST melting temperature. In contrast, moderate-power but longer-duration pulses are applied during the SET operation to heat the cell above the GST crystallization temperature but below the melting temperature [20].

2.4 ReRAM Switching Mechanism

Although many non-volatile memory technologies (e.g., the aforementioned STTRAM and PCRAM) are based on electrically induced resistive switching effects, we define ReRAM as the one that involves electro- and thermochemical effects in the resistance change of a metal-oxide-metal system. In addition, we confine our definition to bipolar ReRAM. Figure 2.5 illustrates the general concept of the ReRAM working mechanism. A ReRAM cell consists of a metal-oxide layer (based on, e.g., Ti [25], Ta [28], or Hf [29]) sandwiched between two metal (e.g., Pt [25]) electrodes. The electronic behavior of the metal/oxide interfaces depends on the oxygen vacancy concentration of the metal-oxide layer. Typically, the metal/oxide interface shows Ohmic behavior in the case of very high doping and rectifying behavior in the case of low doping [25]. In Figure 2.5, the TiO2 region is semi-insulating, indicating a lower oxygen vacancy concentration, while the TiO2-x region is conductive, indicating a higher concentration.


Figure 2.5: The working mechanism of ReRAM cells.

The oxygen vacancy in the metal oxide acts as an n-type dopant, whose drift under an electric field can change the doping profile. Thus, applying an electric current can modulate the I-V curve of the ReRAM cell and switch the cell from one state to the other. Usually, for bipolar ReRAM, the cell can be switched ON (SET operation) only by applying a negative bias and OFF (RESET operation) only by applying the opposite bias [25]. Although the fundamental mechanism of ReRAM switching is still under debate, several ReRAM prototypes [6,32,34] have been demonstrated and show promising fast switching speed and low energy consumption.

2.5 eNVM Read Operations

The read operations of these NVM technologies are almost the same. Since an NVM memory cell has different equivalent resistances in the ON and OFF states, the read operation can be accomplished either by applying a small voltage on the bitline and sensing the current that passes through the memory cell, or by injecting a small current into the bitline and sensing the voltage across the memory cell. Unlike SRAM, which generates complementary read signals from each cell, NVM usually has a group of dummy cells to generate the reference current or reference voltage. The generated current (or voltage) from the to-be-read cell is then compared to the reference current (or voltage) using sense amplifiers. Various types of sense amplifiers can be used in this step, and they are modeled later in Chapter 4.4. A minimal sketch of reference-based sensing follows.
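To make the reference-based sensing concrete, here is a minimal sketch in C++ that decides a bit by comparing the cell current against a reference derived from a pair of dummy cells, one in each state. The resistance values, the read voltage, and the mid-point reference are illustrative assumptions rather than parameters of any prototype or of NVSim.

```cpp
#include <iostream>

// Read a resistive cell by current comparison against a dummy-cell reference.
//   rCell : resistance of the selected cell (ohm)
//   rOn   : dummy-cell resistance in the low-resistance (ON) state
//   rOff  : dummy-cell resistance in the high-resistance (OFF) state
//   vRead : small read voltage applied on the bitline (assumed value)
bool readBit(double rCell, double rOn, double rOff, double vRead = 0.1) {
    double iCell = vRead / rCell;                        // current through the cell
    double iRef  = 0.5 * (vRead / rOn + vRead / rOff);   // reference from dummy cells
    return iCell > iRef;  // more current than reference -> low resistance -> "1"
}

int main() {
    std::cout << readBit(2e3,   2e3, 200e3) << "\n";  // ON cell reads 1
    std::cout << readBit(200e3, 2e3, 200e3) << "\n";  // OFF cell reads 0
}
```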

2.6 eNVM Problems

These eNVM technologies share some common properties. For example, the advantages of STTRAM, PCRAM, and ReRAM include fast memory access (several nanoseconds) and byte-accessibility, while legacy non-volatile technologies are extremely slow (e.g., several microseconds for NAND flash and several milliseconds for HDD) and can only be accessed in units of pages or blocks. Some previous research leverages these advantages to design high-performance file systems [36] and storage devices [37]. However, the major disadvantages of these emerging memory technologies are the asymmetric read/write latency and energy consumption (for all of these technologies), as well as the limited write endurance (except for MRAM and STTRAM).

2.6.1 Write Latency/Energy Issue

Compared to volatile memories such as SRAM and DRAM, eNVMs have a more stable data-keeping mechanism. Accordingly, overwriting existing data takes more time and consumes more energy. Hence, the major drawbacks of eNVMs are their relatively long write latency and high write energy. Many previous research efforts have targeted this aspect. Early write termination [38] and write pausing [39] have been proposed to mitigate the performance penalty caused by long write latency. The data-comparison write technique [40,41] is also commonly applied to reduce the energy consumed by eNVM write operations, as illustrated by the sketch below.
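As an illustration of the data-comparison write idea, the toy sketch below reads the old word and counts only the bits that actually flip, which is the quantity that tracks write energy and cell wear. The 64-bit word width, the array model, and the counter are illustrative assumptions, not an interface from the cited work.

```cpp
#include <bitset>
#include <cstdint>
#include <iostream>
#include <vector>

// Toy word-addressable eNVM array used only to demonstrate the technique.
struct ToyNvmArray {
    std::vector<uint64_t> words;
    long bits_programmed = 0;   // proxy for write energy / wear

    explicit ToyNvmArray(size_t n) : words(n, 0) {}

    // Data-comparison write: read the old word first and program only the
    // bits that actually change; unchanged cells see no write pulse.
    void write(size_t addr, uint64_t new_word) {
        uint64_t diff = words[addr] ^ new_word;            // 1 = bit must flip
        bits_programmed += std::bitset<64>(diff).count();
        words[addr] = new_word;                            // only flipped bits, in hardware
    }
};

int main() {
    ToyNvmArray nvm(16);
    nvm.write(0, 0xFFULL);  // 8 bits flip
    nvm.write(0, 0xFEULL);  // only 1 bit flips
    std::cout << "bits programmed: " << nvm.bits_programmed << "\n";  // prints 9
}
```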

2.6.2 Write Endurance Issue

Write endurance is the number of times that an NVM cell can be overwritten. Among all the NVM technologies mentioned in this dissertation, only MRAM and STTRAM are free from the write endurance issue. NAND flash, PCRAM, and ReRAM all have limited write endurance. NAND flash only has a write endurance of 10^5-10^6. The PCRAM endurance is currently in the range between 10^5 and 10^9 [1,5,14]. ReRAM research currently shows endurance numbers in the range between 10^5 and 10^11 [30,31,35]. The ITRS projection for 2024 for emerging NVMs, i.e., PCRAM and ReRAM, targets endurance on the order of 10^15 or more write cycles [42]. In previous work, several architecture-level techniques, such as ECP [43], dynamically replicated memory [44], SAFER [45], start-gap [46], security refresh [47], and FREE-p [48], have been designed to improve the memory module lifetime under the limited write endurance condition; a simplified wear-leveling sketch follows.
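To give a feel for table-free wear leveling, the sketch below follows the spirit of the start-gap scheme [46]: one spare line plus two registers (Start and Gap) gradually rotate the logical-to-physical mapping so that writes spread across all lines. The register-update rule and the gap-move interval shown here are a simplified reconstruction for illustration only, not the exact published algorithm.

```cpp
#include <cstddef>

// Simplified start-gap-style remapping over N logical lines stored in
// N+1 physical lines (the extra line is the moving gap).
class StartGapRemap {
public:
    explicit StartGapRemap(size_t n, size_t gap_interval = 100)
        : n_(n), start_(0), gap_(n), writes_(0), interval_(gap_interval) {}

    // Logical-to-physical translation.
    size_t translate(size_t logical) const {
        size_t pa = (logical + start_) % n_;
        return (pa >= gap_) ? pa + 1 : pa;   // skip over the gap line
    }

    // Call on every write; every 'interval_' writes the gap moves by one line,
    // which requires one background line copy in the real device (not shown).
    void on_write() {
        if (++writes_ % interval_ != 0) return;
        if (gap_ > 0) {
            // device copies physical[gap_-1] -> physical[gap_]
            --gap_;
        } else {
            // device copies physical[n_] -> physical[0], then the gap wraps
            gap_ = n_;
            start_ = (start_ + 1) % n_;
        }
    }

private:
    size_t n_, start_, gap_, writes_, interval_;
};
```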

Chapter 3

Related Work

Before we start to describe our contributions, we first list the previous research efforts that are eNVM-related. We summarize them into three categories.

3.1 Previous Work on Circuit Level

Many modeling tools have been developed during the last decade to enable system-level design exploration for SRAM- or DRAM-based caches and memory. For example, CACTI [49,50] is a tool that has been widely used in the computer architecture community to estimate the performance, energy, and area of SRAM and DRAM caches. Evans and Franzon [51] developed an energy model for SRAMs and used it to predict an optimum organization for caches. eCACTI [52] incorporated a leakage power model into CACTI. Muralimanohar et al. [53] modeled large-capacity caches through the use of an interconnect-centric organization composed of mats and request/reply H-tree networks.

In addition, CACTI has also been extended to evaluate the performance, energy, and area of STTRAM [54], PCRAM [55,56], cross-point ReRAM [57], and NAND flash [58]. However, as CACTI was originally designed to model an SRAM-based cache, some of its fundamental assumptions do not match actual NVM circuit implementations, and thus the NVM array organization modeled in these CACTI-like estimation tools deviates from the NVM chips that have been fabricated.


3.2 Previous Work on Architecture Level

Several architectural techniques have been proposed mainly to address the eNVM write issues. In order to alleviate the long write latency, Zhou et al. [38] applied an early write termination technique to reduce STTRAM write energy. Similarly, Qureshi et al. [39] introduced a write pausing technique for PCRAM. Wu et al. [59] proposed a data migration scheme for a hybrid cache architecture to reduce the number of eNVM write operations. In order to extend the eNVM lifetime, there are also various architectural proposals including ECP [43], SAFER [45], FREE-p [48], and dynamically replicated memory [44]. These architectural techniques add different types of data redundancy to tolerate the access errors caused by the limited eNVM write endurance. Furthermore, the limited eNVM write endurance may be exploited by an attacker running a malicious application that damages the memory. To counter this issue, Qureshi et al. [46] proposed randomized start-gap wear leveling, and Seong et al. [47] addressed this potential attack by applying security refresh.

3.3 Previous Work on Application Level

Many researchers have studied using non-volatile memories as file storage. The Non-Volatile Systems Laboratory at UCSD built a series of prototype storage systems to explore the future of fast storage. They first designed Moneta [37,60], a PCIe-attached storage array that comprises 64GB of emulated PCRAM storage and a carefully designed interface between hardware and software. Then, they built another storage array prototype called Onyx [61], which is one of the world's first PCRAM-based SSDs. Condit et al. proposed a non-volatile file system [36] that leverages byte-accessibility. Some other researchers have focused on how to manage single-level cells (SLC) and multi-level cells (MLC) dynamically. Kgil et al. [62] studied a NAND flash-based disk cache that dynamically manages NAND flash pages as they approach their lifetime limit and hence increases the overall lifetime of the disk cache system. Lee et al. [63] proposed a flexible flash file system that dynamically leverages unused flash storage and uses SLC instead of MLC to improve the access latency.

This prior research focuses on NAND flash memory, which has the erase-before-write constraint. This constraint forces flash-based file systems to be designed around journaling, which quickly fills the entire device capacity. Qureshi et al. [64] demonstrated a morphable memory system, a PCRAM-based main memory space with both MLC and SLC regions that uses memory monitoring to determine the ratio between them.

In addition, using non-volatile memories for checkpointing has also been investigated. Wang et al. [65] proposed and designed a non-volatile flip-flop structure for checkpointing processors in energy-harvesting applications. Volos et al. devised a simple interface called Mnemosyne [66] for programming with persistent memory. By leveraging the non-volatility property, Mnemosyne ensures consistency in the presence of failures.

Considering that some non-volatile memory technologies such as STTRAM and ReRAM have relatively better write performance, using them as on-chip caches can also bring benefits. Rasquinha et al. [67] and Sun et al. [68] designed STTRAM-based last-level caches targeting reduced write energy and mitigated write errors, respectively. Smullen et al. [69] investigated the trade-off among non-volatility, speed, and energy efficiency for STTRAM caches. Finally, Wu et al. [59] studied hybrid cache hierarchies that combine various volatile and non-volatile memory technologies.

Chapter 4

Circuit-Level: eNVM Modeling

As the ultimate goal of this eNVM research is to devise a universal memory hierarchy as shown in Figure 1.1, each of these eNVM technologies has to cover a wide design space that spans from highly latency-optimized microprocessor caches to highly density-optimized secondary storage. Specialized peripheral circuitry is therefore required for each optimization target. However, since few of these eNVM technologies are mature so far, only a limited number of prototype chips have been demonstrated, and they cover only a small portion of the entire design space. Circuit-level estimation models are thus necessary to facilitate eNVM research by predicting eNVM performance, energy consumption, and chip area without the heavy effort of building prototypes. Hence, as the first step of this eNVM research, we develop NVSim, a circuit-level model for eNVM performance, energy, and area estimation. As an extension of CACTI [50], NVSim supports not only SRAM and DRAM, but also NAND flash, STTRAM, PCRAM, and ReRAM.

4.1 NVSim Project Overview

The main goals of developing NVSim tool are:

• Estimate the access time, access energy, and silicon area of NVM chips with a given organization and specific design options before committing to actual fabrication;


• Explore the NVM chip design space to find the chip organization and design options that achieve the best performance, energy, or area;

• Find the NVM chip organization and design options that are optimized for one design metric while keeping the other metrics under given constraints.

We build NVSim using the same empirical modeling methodology as CACTI [49,50], but starting from a new framework and adding features specific to NVM technologies. Compared to CACTI, the NVSim framework includes the following new features:

• It allows the sense amplifiers to be moved from the inner memory subarrays to the outer bank level and factored out, improving the overall area efficiency of the memory module;

• It provides more flexible array organizations and data activation modes by considering any combinations of memory data allocation and address distribution;

• It models various types of data sensing schemes instead of only the voltage-sensing scheme;

• It allows memory banks to be formed in a bus-like manner rather than the H-tree manner only;

• It provides multiple buffer design options instead of only the latency-optimized option based on logical effort;

• It models the cross-point memory cells rather than MOS-accessed memory cells only;

• It considers the subarray size limit by analyzing the current sneak path;

• It allows advanced target users to redefine memory cell properties by providing a customization interface.

Since one of the design goals of NVSim is to facilitate architecture-level eNVM research, the accuracy of NVSim has to be sufficiently high. For this purpose, NVSim is validated against several industry prototype chips, with errors within 30%.


Figure 4.1: An example of the memory array organization modeled in NVSim: a hierarchical memory organization includes banks, mats, and subarrays with decoders, multiplexers, sense amplifiers, and output drivers.

4.2 NVSim Framework

The framework of NVSim is modified on the basis of CACTI [50, 70]. We add several new features, such as more flexible data activation modes and alternative bank organizations.

4.2.1 Device Model

NVSim uses device data from the ITRS report [42] and the MASTAR tool [71] to obtain the process parameters. NVSim covers the process nodes from 22nm to 180nm and supports 3 transistor types, which are High Performance (HP), Low Operating Power (LOP), and Low Stand-by Power (LSTP).

4.2.2 Array Organization

Figure 4.1 shows the array organization. There are 3 hierarchy levels in such organization, which are bank, mat, and subarray. Basically, the descriptions of these levels are:

• Bank is the top-level structure modeled in NVSim. One non-volatile memory chip can have multiple banks. The bank is a fully-functional memory unit, and it can be operated independently. In each bank, multiple mats are connected together in either H-tree or bus-like manner.

• Mat is the building block of a bank. Multiple mats in a bank operate simultaneously to fulfill a memory operation. Each mat consists of multiple subarrays and one predecoder block.

• Subarray is the elementary structure modeled in NVSim. Every subarray contains a set of peripheral circuitry including row decoders, column multiplexers, and output drivers.

Conventionally, sense amplifiers are integrated on the subarray level as modeled in CACTI [50, 70]. However, in NVSim model, sense amplifiers can be placed either on the subarray level or the mat level.

4.2.3 Memory Bank Type

For practical memory designs, memory cells are grouped together to form memory modules of different types. For instance,

• The main memory is a typical random-access memory (RAM), which takes the address of data as input and returns the content of data;

• The set-associative cache contains two separate RAMs (data array and tag array), and can return the data if there is a cache hit by the given set address and tag;

• The fully-associative cache usually contains a content-addressable memory (CAM).

To cover all possible memory designs, we model 5 types of memory banks in NVSim: one for RAM, one for CAM, and three for set-associative caches with different access manners. The functionalities of these 5 types of memory banks are listed as follows:

1. RAM: Output the data content at the I/O interface given the data address.

2. CAM: Output the data address at the I/O interface given the data content if there is a hit.

3. Cache with normal access: Start to access the cache data array and tag array at the same time; the data content is temporarily buffered in each mat; if there is a hit, the cache hit signal generated from the tag array is routed to the proper mats, and the content of the desired cache line is output to the I/O interface.

Table 4.1: The initial number of wires in each routing group

Memory type                            Address wires (NAW)   Broadcast data wires (NBW)   Distributed data wires (NDW)
Random-access memory (RAM)             log2(Nblock)          0                            Wblock
Content-addressable memory (CAM)       0                     Wblock                       log2(Nblock)
Cache, normal access, data array       log2(Nblock/A)        log2(A)                      Wblock
Cache, normal access, tag array        log2(Nblock/A)        Wblock                       A
Cache, sequential access, data array   log2(Nblock)          0                            Wblock
Cache, sequential access, tag array    log2(Nblock/A)        Wblock                       A
Cache, fast access, data array         log2(Nblock/A)        0                            Wblock × A
Cache, fast access, tag array          log2(Nblock/A)        Wblock                       A

4. Cache with sequential access: Access the cache tag array at first; if there is a hit, then access the cache data array with the set address and the tag hit information, and finally output the desired cache line to the I/O interface.

5. Cache with fast access: Access the cache data array and tag array simultaneously; read the entire set content from the mats to the I/O interface; selectively output the desired cache line if there is a cache hit signal generated from the tag array.

4.2.4 Activation Mode

We model the array organization and the data activation modes using eight parameters, which are

• NMR: number of rows of mat arrays in each bank;

• NMC : number of columns of mat arrays in each bank;

• NAMR: number of active rows of mat arrays during data accessing;

• NAMC : number of active columns of mat arrays during data accessing;

• NSR: number of rows of subarrays in each mat;

• NSC : number of columns of subarrays in each mat;

• NASR: number of active rows of subarrays during data accessing;

• and NASC : number of active columns of subarrays during data accessing.

The values of these parameters are all constrained to be powers of two. NMR and NMC define the number of mats in a bank, and NSR and NSC define the number of subarrays in a mat. NAMR, NAMC, NASR, and NASC define the activation patterns, and they can take any values up to NMR, NMC, NSR, and NSC, respectively. In contrast, the limited array organizations and data activation patterns in CACTI result from several constraints on these parameters, such as NAMR = 1, NAMC = NMC, and NSR = NSC = NASR = NASC = 2. NVSim supports these flexible activation patterns and is thereby able to model sophisticated memory accessing techniques, such as single subarray activation [72]. The sketch below summarizes the parameters and their constraints.
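As a concrete illustration, the sketch below captures the eight parameters in a small C++ structure along with the constraints stated above (power-of-two values; active counts no larger than the totals) and the number of mats and subarrays touched per access. The structure and helper names are illustrative assumptions, not NVSim's actual data structures.

```cpp
// Eight array-organization / activation-mode parameters (illustrative only).
struct ActivationMode {
    unsigned nmr, nmc;    // mat rows / columns per bank      (NMR, NMC)
    unsigned namr, namc;  // active mat rows / columns        (NAMR, NAMC)
    unsigned nsr, nsc;    // subarray rows / columns per mat  (NSR, NSC)
    unsigned nasr, nasc;  // active subarray rows / columns   (NASR, NASC)

    static bool isPow2(unsigned x) { return x && !(x & (x - 1)); }

    // All parameters are powers of two; active counts never exceed totals.
    bool valid() const {
        const unsigned v[8] = {nmr, nmc, namr, namc, nsr, nsc, nasr, nasc};
        for (unsigned x : v) if (!isPow2(x)) return false;
        return namr <= nmr && namc <= nmc && nasr <= nsr && nasc <= nsc;
    }

    unsigned activeMatsPerAccess() const      { return namr * namc; }
    unsigned activeSubarraysPerMat() const    { return nasr * nasc; }
};

// Example (assumed subarray numbers): the 4x4 mat organization of Chapter 4.2.5
// with two mat rows and two mat columns active per access.
//   ActivationMode m{4, 4, 2, 2, 2, 2, 1, 1};   // m.valid() == true
```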

4.2.5 Routing to Mats

In order to route the data and address signals from the I/O port to the edges of the memory mats, and then from each mat to the edges of its memory subarrays, we divide all the interconnect wires into three categories: Address Wires, Broadcast Data Wires, and Distributed Data Wires. Depending on the memory module type and the activation mode, the initial number of wires in each group is assigned according to the rules listed in Table 4.1. We use the term block to refer to the memory words in RAM and CAM designs and the cache lines in cache designs. In Table 4.1, Nblock is the number of blocks, Wblock is the block size, and A is the associativity in cache designs. The number of Broadcast Data Wires is always kept unchanged, the number of Distributed Data Wires is cut in half at each routing point where data are merged, and the number of Address Wires is decremented by one at each routing point where data are multiplexed.

We use the case of the cache bank with normal access to demonstrate how the wires are routed from the I/O port to the edges of the mats. For simplicity, we suppose the data array and the tag array are two separate modules. While the data and tag arrays usually have different mat organizations in practice, we use the same 4x4 mat organization for both for demonstration purposes, as shown in Figure 4.2 and Figure 4.3. The total 16 mats are


Figure 4.2: The example of the wire routing in a 4x4 mat organization for the data array of an 8-way 1MB cache with 64B cache lines.


Figure 4.3: The example of the wire routing in a 4x4 mat organization for the tag array of an 8-way 1MB cache with 64B cache lines.

positioned in a 4x4 formation and connected by a 4-level H-tree. Therefore, NMR and NMC are 4. As an example, we use the activation mode in which two rows and two columns of the mat array are activated for each data access, and the activation groups are Mat {0, 2, 8, 10},

Mat {1, 3, 9, 11}, Mat {4, 6, 12, 14}, and Mat {5, 7, 13, 15}. Thereby, NAMR and NAMC are 2. In addition, we set the cache line size (block size) to 64B, the cache associativity to A = 8, and the cache bank capacity to 1MB, so that the number of cache lines (blocks)

is Nblock = 8M/512 = 16384, the block size in the data array is Wblock,data = 512, and the

block size in the tag array is Wblock,tag = 16 (assuming 32-bit addressing and labeling dirty block with one bit).

According to Table 4.1, the initial number of address wires (NAW) is log2(Nblock/A) = 11

(NBW,data) is log2A = 3, which is used to transit the tag hit signals from the tag array to the corresponding mats in the data array; the initial number of distributed data wires

(NDW,data) is Wblock,data = 512, which is used to output the desired cache line from the mats to the I/O port. For tag array, the broadcast data wire (NBW,tag) is Wblock,tag = 16, which is sent from the I/O port to each of the mat in the tag array; the initial number of distributed data wires (NDW,tag) is A = 8, which is used to collect the tag hit signals from each mat to the I/O port and then send to the data array after a 8-to-3 encoding process. From the I/O port to the edges of the mats, the numbers of wires in the three categories are changed as follows and demonstraed in Figure 4.2 and Figure 4.3, respectively.

1. At node A, the activated mats are distributed in both the upper and the bottom parts, so node A is a merging node. As per the routing rule, the address wires and broadcast data wires remain the same but the distributed data wires are cut in half. Thus, the

wire segment between node A and B have: NAW = 11, NBW,data = 3, NDW,data = 256,

NBW,tag = 16, and NDW,tag = 4.

2. Node B is again a merging node. Thus, the wire segment between node B and C have:

NAW = 11, NBW,data = 3, NDW,data = 128, NBW,tag = 16, and NDW,tag = 2.

3. At node C, the activated mats are allocated only in one side, either from Mat 0/1 or from Mat 4/5, so Node C is a multiplexing node. As per the routing rule, the distributed data wires and broadcast data wires remain the same but the address wires is decremented by 1. Thus, the wire segment between node C and D have:

NAW = 10, NBW,data = 3, NDW,data = 128, NBW,tag = 16, and NDW,tag = 2.

4. Finally, node D is another multiplexing node. Thus, Thus, the wire segments at the

mat edges have: NAW = 9, NBW,data = 3, NDW,data = 128, NBW,tag = 16, and

NDW,tag = 2.

Thereby, each mat in the data array takes the input of a 9-bit set address and a 3-bit tag hit signals (which can be treated as the block address in a 8-way associative set), and it generates the output of a 128-bit data. A group of 4 data mats provides the desired output 23

Subarray 0 Subarray 1 Subarray 2 Subarray 3

Sense Sense Sense Sense amplifier amplifier Predecoder amplifier amplifier

H-tree Full-swing signals

Sense Sense Sense Sense amplifier amplifier amplifier amplifier

Subarray 4 Subarray 5 Subarray 6 Subarray 7

Mat

Figure 4.4: An example of mat using internal sensing and H-tree routing. of a 512-bit (64B) cache line, and four such groups cover the entire 11-bit set address space. On the other hand, each mat in the tag array takes the input of a 9-bit set address and a 16-bit tag, and it generates a 2-bit hit signals (01 or 10 for hit and 00 for miss). A group of 4 tag mats concatenate their hit signals and provides the information whether a 16-bit tag hits in a 8-way associated cache with a 9-bit address space, and four such groups extend the address space from 9-bit to the desired 11-bit. Other configurations in Table 4.1 can be explained in the similar manner.

4.2.6 Routing to Subarrays

The interconnect wires from mat to the edges of memory subarrays are routed using the same H-tree organization as shown in Figure 4.4, and its routing strategy is the same wire partitioning rule described in Chapter 4.2.5. However, in NVSim, we provide an option of building mat using a bus-like routing organization as illustrated in Figure 4.5. The wire partitioning rule described in Chapter 4.2.5 can also be applied to the bus-like organization with a few extension. For example, a multiplexing node with a fanout of N decrements the number of address wires by log2N instead of 1; a merging node with a fanout of N divides the number of distributed data wires by N instead of 2. 24

Partial- swing Subarray 0 Subarray 1 Subarray 2 signals Subarray 3

Subarray 4 Subarray 5 Subarray 6 Subarray 7

Predecoder Sense Sense amplifier amplifier Mat

Figure 4.5: An example of mat using external sensing and bus-like routing.

Furthermore, the default setting of including sense amplifiers in each subarray can cause a dominant portion of the total array area. As a result, for high-density memory module designs, NVSim provides an option of moving the sense amplifiers out of the subarray and using external sensing. In addition, a bus-like routing organization is designed to associate with the external sensing scheme. Figure 4.4 shows a common mat using H-tree organization to connect all the sense amplifier-included subarrays together. In contrast, the new external sensing scheme is illustrated in Figure 4.5. In this external sensing scheme, all the sense amplifiers are lo- cated at the mat level and the output signals from each sense amplifier-free subarray are partial-swing. It is obvious that the external sensing scheme has much higher area efficiency compared to its internal sensing counterpart. However, as a penalty, sophisticated global interconnect technologies, such as repeater inserting, cannot be used in the external sens- ing scheme since all the global signals are partial-swing before passing through the sense amplifiers.

4.3 Area Model

Since NVSim estimates the performance, energy, and area of non-volatile memory modules, the area model is an essential component of NVSim especially given the facts that inter- connect wires contribute a large portion of total access latency and access energy and the geometry of the module becomes highly important. In this section, we describe the NVSim 25

Storage element

Word line

Bit line

Source line

Figure 4.6: Conceptual view of a MOS-accessed cell (1T1R) and its connected word line, bit line, and source line. area model from the memory cell level to the bank level in details.

4.3.1 Cell Area Estimation

Three types of memory cells are modeled in NVSim: MOS-accessed, cross-point, and NAND-string.

MOS-Accessed Cell

MOS-accessed cell corresponds to the typical 1T1R (1-transistor-1-resistor) structure used by many NVM chips [4, 5, 11, 13, 17, 19, 73], in which an NMOS access device is connected in series with the non-volatile storage element (i.e., the MTJ in STT-RAM, GST in PCRAM, and metal oxide in ReRAM), as shown in Figure 4.6. Such an NMOS device turns the access path to the storage element on or off by tuning the voltage applied to its gate. The MOS-accessed cell usually has the best isolation among neighboring cells due to the properties of the MOSFET. In MOS-accessed cells, the size of the NMOS is bounded by the current needed by the write operation: the NMOS in each cell needs to be sufficiently large so that it is capable of driving enough write current.

The driving current of the NMOS, I_DS, can be estimated to first order as follows¹,

    I_DS = K (W/L) [ (V_GS - V_TH) V_DS - V_DS^2 / 2 ]                    (4.1)

if the NMOS is working in the linear region, or calculated by

    I_DS = (K/2) (W/L) (V_GS - V_TH)^2 (1 + λ V_DS)                       (4.2)

if the NMOS is working in the saturation region. Hence, no matter in which region the NMOS is working, its current driving ability is proportional to its width-to-length (W/L) ratio², which determines the NMOS size. To achieve high cell density, we model the MOS-accessed cell area by referring to DRAM design rules [74]. As a result, the cell size of a MOS-accessed cell in NVSim is calculated as follows,

    Area_cell,MOS-accessed = 3 (W/L + 1) F^2                              (4.3)

in which the width-to-length ratio (W/L) is determined by Equation 4.1 or Equation 4.2, and the required write current is configured as one of the input values of NVSim. In NVSim, we also allow advanced users to override this cell size calculation by directly importing a user-defined cell size.
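As a worked example of Equations 4.1-4.3, the sketch below sizes the access NMOS from a required write current (assuming saturation-region operation) and then derives the cell area; the device parameter values are illustrative placeholders, not NVSim defaults.

    #include <cstdio>

    // A minimal sketch of Equations 4.2 and 4.3: size the access NMOS so that it can
    // drive the required write current, then derive the MOS-accessed cell area.
    // All device parameters below are illustrative placeholders.
    int main() {
        const double K      = 300e-6;  // transconductance parameter (A/V^2), placeholder
        const double Vgs    = 1.1;     // gate-source voltage (V)
        const double Vth    = 0.35;    // threshold voltage (V)
        const double Vds    = 1.0;     // drain-source voltage (V)
        const double lambda = 0.1;     // channel-length modulation (1/V)
        const double Iwrite = 200e-6;  // required write current (A), an input to NVSim

        // Saturation-region current for a unit W/L (Equation 4.2), then scale W/L.
        double idsPerWL = 0.5 * K * (Vgs - Vth) * (Vgs - Vth) * (1.0 + lambda * Vds);
        double wOverL   = Iwrite / idsPerWL;

        // Cell area in units of F^2 (Equation 4.3), following DRAM design rules.
        double areaF2 = 3.0 * (wOverL + 1.0);

        std::printf("required W/L = %.2f, cell area = %.2f F^2\n", wOverL, areaF2);
        return 0;
    }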

Cross-Point Cell

Cross-point cell corresponds to the 1D1R (1-diode-1-resistor) [1, 16, 23, 34, 75] or 0T1R (0-transistor-1-resistor) [22, 26, 30] structures used by several high-density NVM chips recently. Figure 4.7 shows a cross-point array without diodes (i.e., the 0T1R structure). For the 1D1R structure, a diode is inserted between the word line and the storage element. Such cells either rely on the one-way connectivity of the diode (i.e., 1D1R) or leverage the materials' non-linearity (i.e., 0T1R) to control the memory access path.

¹Equations 4.1 and 4.2 are for long-channel drift/diffusion devices; the equations are subject to change depending on the technology, though the proportional relationship between the current and W/L still holds for very advanced technologies.
²Usually, the transistor length (L) is fixed as the minimal feature size, and the transistor width (W) is adjustable.

Figure 4.7: Conceptual view of a cross-point cell array without diode (0T1R) and its connected word lines and bit lines.

As illustrated in Figure 4.7, the widths of the word lines and bit lines can be the minimal value of 1F and the spacing in each direction is also 1F; thus, the cell size of each cross-point cell is,

    Area_cell,cross-point = 4 F^2                                         (4.4)

Compared to MOS-accessed cells, cross-point cells have worse cell isolation but provide a way of building high-density memory chips because they have much smaller cell sizes. In some cases, the cross-point cell size is constrained by the diode due to its limited current density; therefore, NVSim allows users to override the default 4F^2 setting.

NAND-String Cell

The NAND-string cell is modeled particularly for NAND flash memory. In a NAND-string cell, a group of floating gates are connected in series, and two ordinary gates with contacts are added at the end-points of the string, as shown in Figure 4.8. Since the area of each floating gate can be minimized to 2F×2F, the total area of a NAND-string cell is,

    Area_cell,NAND-string = 2 (2N + 5) F^2                                (4.5)

where N is the number of floating gates in a string, and we assume the addition of two gates and two contacts adds 5F to the total string length.

Figure 4.8: The layout of the NAND-string cell modeled in NVSim.

4.3.2 Peripheral Circuitry Area Estimation

Besides the area occupied by memory cells, a large portion of the memory chip area is contributed by the peripheral circuitry. In NVSim, we have peripheral circuitry components such as row decoders, prechargers, and column multiplexors at the subarray level, predecoders at the mat level, and sense amplifiers and write drivers at either the subarray or mat level depending on whether the internal or external data sensing scheme is used. In addition, at every level, interconnect wires might occupy extra silicon area if the wires are relayed using repeaters. In order to estimate the area of each peripheral circuitry component, we delve into the actual gate-level logic design, similarly to CACTI [50]. However, in NVSim, we size transistors in a more generalized way than CACTI does. The sizing philosophy of CACTI is to use logical effort [76] to size the circuits for minimum delay. NVSim's goal is to estimate the properties of a broad range of NVM chips, and these chips might be optimized for density or energy consumption instead of minimum delay; thus we provide optional sizing methods rather than only applying logical effort.

Figure 4.9: Transistor sizings: (a) latency-optimized (sizes 1-4-16-64-256-1024-4096; total delay = 30 units, total area = 1365 units); (b) balanced (sizes 1-4-16-64-4096; total delay = 80 units, total area = 85 units); (c) area-optimized (sizes 1-64-4096; total delay = 130 units, total area = 65 units).

In addition, for some peripheral circuitry in NVM chips, the size of some transistors is determined by their required driving current instead of their capacitive load, which violates the basic assumptions of logical effort. Therefore, we offer three transistor sizing choices in the area model of NVSim: one optimizing latency, one optimizing area, and another balancing latency and area. An example is illustrated in Figure 4.9, demonstrating the different sizing methods when an output buffer driving 4096 times the capacitance of a minimum-sized inverter is to be designed. In a latency-optimized buffer design, the number of stages and all of the inverter sizes in the inverter chain are calculated by logical effort to achieve the minimum delay (30 units) while paying a huge area penalty (1365 units). In an area-optimized buffer design, there are only two stages of inverters, and the size of the last stage is determined by the minimum driving-current requirement. This type of buffer has the minimum area (65 units) but is much slower than the latency-optimized buffer. The balanced option determines the size of the last-stage inverter by its driving-current requirement and calculates the sizes of the other inverters by logical effort. This results in a balanced delay and area metric.
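The unit-delay and unit-area numbers quoted in Figure 4.9 can be reproduced with a first-order model in which each inverter contributes its size to the area and each stage contributes (fanout + 1) delay units; the sketch below is such a reconstruction for the three chains of Figure 4.9, not NVSim's actual sizing code.

    #include <cstdio>
    #include <vector>

    // First-order evaluation of an inverter chain driving a large load: each
    // inverter adds its own size to the area, and each stage contributes
    // (fanout + 1) delay units, where the "+1" stands for the parasitic delay.
    static void evaluate(const char* name, const std::vector<double>& chain) {
        // 'chain' lists the inverter sizes followed by the final load (here 4096).
        double delay = 0.0, area = 0.0;
        for (size_t i = 0; i + 1 < chain.size(); ++i) {
            delay += chain[i + 1] / chain[i] + 1.0;  // stage effort + parasitic
            area  += chain[i];                       // the load itself is not counted
        }
        std::printf("%-18s delay = %5.0f units, area = %5.0f units\n", name, delay, area);
    }

    int main() {
        evaluate("latency-optimized", {1, 4, 16, 64, 256, 1024, 4096});
        evaluate("balanced",          {1, 4, 16, 64, 4096});
        evaluate("area-optimized",    {1, 64, 4096});
        return 0;
    }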

4.4 Timing and Power Models

As an analytical modeling tool, NVSim uses RC analysis to model the timing and power. In this section, we describe how resistances and capacitances are estimated in NVSim and how they are combined to calculate the delay and power consumption.

4.4.1 Generic Timing and Power Estimation

In NVSim, we consider the wire resistance and wire capacitance from interconnects, the turn-on resistance, switching resistance, gate and drain capacitances from transistors, and the equivalent resistance and capacitance from memory storage elements (e.g., the MTJ in STT-RAM and GST in PCRAM). The methods of estimating wire and parasitic resistances and capacitances are modified from the previous versions of CACTI [49, 50] with several enhancements. The enhancements include updating the transistor models with the latest ITRS report [42], considering the thermal impact on the wire resistance calculation, adding the drain-to-channel capacitance in the drain capacitance calculation, and so on. We build a look-up table to model the equivalent resistance and capacitance of the memory storage elements, since they are properties of each specific non-volatile memory technology. Since NVSim is a system-level estimation tool, we only model the static behavior of the storage elements and record the equivalent

resistances and capacitances of the RESET and SET states (i.e., R_RESET, R_SET, C_RESET, and C_SET)³. After calculating the resistances and capacitances of all nodes, the delay of each logic component is calculated by using a simplified version of Horowitz's timing model [77] as follows,

    Delay = τ · sqrt( (ln(1/2))^2 + αβ )                                  (4.6)

where α is the slope of the input, β = g_m R is the input transconductance normalized by the output resistance, and τ = RC is the RC time constant.

³One exception is that NVSim records the detailed I-V curves for cross-point ReRAM cells without diodes, because we need to leverage the non-linearity of the storage element.

The dynamic energy and leakage power consumptions can be modeled as

    Energy_dynamic = C V_DD^2                                             (4.7)

    Power_leakage = V_DD I_leak                                           (4.8)

where we model both gate leakage and sub-threshold leakage currents in Ileak. The overall memory access latency and energy consumption are estimated by combining all the timing and power values of circuit components together. NVSim follows the same methodology that CACTI [50] uses with minor modifications.
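A minimal sketch of how Equations 4.6-4.8 combine in practice is shown below; the resistance, capacitance, and transconductance values are placeholders chosen only for illustration and are not NVSim defaults.

    #include <cmath>
    #include <cstdio>

    // A minimal sketch of the generic timing/power estimation (Equations 4.6-4.8).
    double horowitzDelay(double r, double c, double alpha, double gm) {
        const double tau    = r * c;        // RC time constant
        const double beta   = gm * r;       // normalized input transconductance
        const double lnHalf = std::log(0.5);
        return tau * std::sqrt(lnHalf * lnHalf + alpha * beta);   // Equation 4.6
    }

    int main() {
        const double R = 10e3, C = 20e-15;        // 10 kOhm, 20 fF (placeholders)
        const double alpha = 0.5, gm = 50e-6;     // input slope, transconductance
        const double Vdd = 1.0, Ileak = 1e-9;     // supply voltage and total leakage current

        double delay   = horowitzDelay(R, C, alpha, gm);
        double dynamic = C * Vdd * Vdd;           // Equation 4.7
        double leakage = Vdd * Ileak;             // Equation 4.8

        std::printf("delay = %.3g s, dynamic energy = %.3g J, leakage = %.3g W\n",
                    delay, dynamic, leakage);
        return 0;
    }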

4.4.2 Data Sensing Models

Unlike the other peripheral circuitry, the sense amplifier is an analog design instead of a logic design. Thus, in NVSim, we develop a separate timing model for the data sensing schemes. Different sensing schemes have different impacts on the trade-off among performance, energy, and area. In NVSim, we consider three types of sensing schemes: current sensing, current-in voltage sensing, and voltage-divider sensing. In the current sensing scheme, as shown in Figure 4.10, the state of the memory cell (STT-RAM, PCRAM, or ReRAM) is read out by measuring the resulting current through the selected memory cell when a read voltage is applied: the current on the bit-line is compared to the reference current generated by reference cells, the current difference is amplified by current-mode sense amplifiers, and it is eventually converted to a voltage signal. Figure 4.11 demonstrates an alternative sensing method that applies a current source to the selected memory cell and senses the voltage via a voltage-mode sense amplifier.

The voltage-divider sensing scheme is implemented by introducing a resistor (R_x) in series with the memory cell, as illustrated in Figure 4.12. The resistance value is selected to achieve the maximum read sensing margin, and it is calculated as follows,

    R_x = sqrt( R_on × R_off )                                            (4.9)

where R_on and R_off are the equivalent resistance values of the memory cell in the LRS and HRS, respectively.

Figure 4.10: Analysis model for current sensing scheme.

Figure 4.11: Analysis model for current-in voltage sensing scheme.


Bitline RC Model

We model the bit-line RC delay analytically for each sensing scheme. The most significant difference between current-mode sensing and voltage-mode sensing is that the input resistance of ideal current-mode sensing is zero while that of ideal voltage-mode sensing is infinite. Moreover, the most significant difference between current-in voltage sensing and voltage-divider sensing is that the internal resistance of an ideal current source is infinite, while the resistor R_x serving as a voltage divider can be treated as the internal resistance of a voltage source. The delays of current-in voltage sensing, voltage-divider sensing, and current sensing are given by the following equations, using Seevinck's delay expression [78]:

    δt_v  = (R_T C_T / 2) × ( 1 + 2 R_B / R_T )                           (4.10)

    δt_vd = (R_T C_T / 2) × ( 1 + 2 (R_B || R_x) / R_T )                  (4.11)

    δt_i  = (R_T C_T / 2) × ( (R_B + R_T / 3) / (R_B + R_T) )             (4.12)

where R_T and C_T are the total line resistance and capacitance, R_B is the equivalent resistance of the memory cell, and R_x is the resistance of the voltage divider.

Figure 4.12: Analysis model for voltage-divider sensing scheme.

In these equations, δt_v, δt_vd, and δt_i are the RC delays of the current-in voltage sensing, voltage-divider sensing, and current sensing schemes, respectively. R_x || R_B, instead of R_B, is used as the effective pull-down resistance in Equation 4.11 according to the transformation from a Thevenin equivalent to a Norton equivalent. Equations 4.10 and 4.11 show that voltage-divider sensing is faster than current-in voltage sensing, at the extra cost of fabricating a large resistor. Comparing Equation 4.12 with Equations 4.10 and 4.11, we can see that current sensing is much faster than both current-in voltage sensing and voltage-divider sensing, since the former delay is less than the intrinsic line delay R_T C_T / 2 while the latter delays are larger than R_T C_T / 2. The bit-line delay analytical models are verified by comparing them with HSPICE simulation results.
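The following sketch evaluates Equations 4.9-4.12 for one set of illustrative (not measured) resistance and capacitance values; it reproduces the ordering argued above, with the current-sensing delay below R_T C_T / 2 and the other two above it.

    #include <cmath>
    #include <cstdio>

    // A minimal sketch of the bit-line delay expressions (Equations 4.9-4.12).
    int main() {
        const double Ron = 5e3, Roff = 100e3;         // LRS / HRS cell resistances (placeholders)
        const double Rt  = 2e3,  Ct  = 100e-15;       // total line resistance / capacitance
        const double Rb  = Ron;                       // equivalent cell resistance being read

        double Rx   = std::sqrt(Ron * Roff);                          // Eq. 4.9
        double Rpar = (Rb * Rx) / (Rb + Rx);                          // Rb || Rx

        double dtV  = Rt * Ct / 2.0 * (1.0 + 2.0 * Rb / Rt);          // Eq. 4.10, current-in voltage
        double dtVd = Rt * Ct / 2.0 * (1.0 + 2.0 * Rpar / Rt);        // Eq. 4.11, voltage-divider
        double dtI  = Rt * Ct / 2.0 * (Rb + Rt / 3.0) / (Rb + Rt);    // Eq. 4.12, current sensing

        std::printf("Rx = %.1f Ohm\n", Rx);
        std::printf("delays: voltage %.3g s, voltage-divider %.3g s, current %.3g s\n",
                    dtV, dtVd, dtI);
        return 0;
    }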

Current-Voltage Converter Model

As shown in Figure 4.10, the current-voltage converter in our current-mode sensing scheme is actually the first-level sense amplifier, and the CACTI-modeled voltage sense amplifier is still kept in the bit-line model as the final stage of the sensing scheme. The current-voltage converter senses the current difference I1 - I2, which is then converted into a voltage difference V1 - V2. The required voltage difference produced by the current-voltage converter is set by default to 80mV. Although this value is the minimum sensible voltage difference of the CACTI-modeled voltage sense amplifier, advanced users can override it for a specific sense amplifier design. We refer to a previous current-voltage converter design [78]; its circuit schematic is shown in Figure 4.13. This sensing scheme is similar to the hybrid-I/O approach [79], which can achieve high-speed, robust sensing and low-power operation. To avoid unnecessary calculation, the current-voltage converter is modeled by directly using HSPICE-simulated values and building a look-up table of delay, dynamic energy, and leakage power.

Figure 4.13: The current-voltage converter modeled in NVSim.

4.4.3 Cell Switching Model

Different NVM technologies have their own specific switching mechanisms. Usually, the switching phenomenon involves magnetoresistive, phase-change, thermochemical, or electrochemical effects, and it cannot be estimated by RC analysis. Hence, the cell switching model in NVSim largely relies on the NVM cell definition. The predefined NVM cell switching properties include the SET/RESET pulse durations (i.e., t_SET and t_RESET) and the SET/RESET currents (i.e., I_SET and I_RESET) or voltages. NVSim does not model the dynamic behavior during the switching of the cell state; the switching latency (i.e., cell write latency) is directly the pulse duration, and the switching energy (i.e., cell write energy) is estimated using Joule's first law, that is,

    Energy_SET   = I_SET^2   R t_SET

    Energy_RESET = I_RESET^2 R t_RESET                                    (4.13)

in which the resistance value R can be the equivalent resistance of the corresponding SET or RESET state (i.e., R_SET or R_RESET). However, for NVM technologies that exhibit a threshold switching phenomenon (e.g., PCRAM and ReRAM), the resistance value R always equals the resistance of the low-resistance state. This is because when a voltage above a particular threshold is applied to these NVM cells in the high-resistance state, the resulting large electric fields greatly increase the electrical conductivity [80].
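A minimal sketch of the cell switching energy model (Equation 4.13), including the threshold-switching rule, is given below; the pulse currents, durations, and resistances are hypothetical cell-definition values, not data from any prototype.

    #include <cstdio>

    // Cell switching (write) energy following Equation 4.13.
    struct CellDef {
        double iSet, tSet;          // SET current (A) and pulse duration (s)
        double iReset, tReset;      // RESET current (A) and pulse duration (s)
        double rSet, rReset;        // equivalent resistances of the SET / RESET states
        bool   thresholdSwitching;  // true for PCRAM/ReRAM-like threshold behavior
    };

    double setEnergy(const CellDef& c) {
        return c.iSet * c.iSet * c.rSet * c.tSet;            // I_SET^2 * R * t_SET
    }

    double resetEnergy(const CellDef& c) {
        // For threshold-switching cells, the cell effectively conducts at the
        // low-resistance value once the threshold is exceeded, so R = R_SET.
        double r = c.thresholdSwitching ? c.rSet : c.rReset;
        return c.iReset * c.iReset * r * c.tReset;           // I_RESET^2 * R * t_RESET
    }

    int main() {
        CellDef pcram{150e-6, 150e-9, 300e-6, 40e-9, 5e3, 100e3, true};
        std::printf("SET energy = %.3g J, RESET energy = %.3g J\n",
                    setEnergy(pcram), resetEnergy(pcram));
        return 0;
    }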

Figure 4.14: The circuit schematic of the slow quench pulse shaper used in [1].

4.5 Miscellaneous Circuitry

Some specialized circuitry is required for certain types of NVMs. For instance, some PCRAM chips need a pulse shaper to form accurate SET and RESET pulses, and NAND flash and some PCRAM chips need a charge pump to generate the high-voltage power plane that is necessary for write operations.

4.5.1 Pulse Shaper

Some PCRAM chips need specialized circuits to handle their RESET and SET operations. Specific pulse shapes are required to heat up the GST quickly and to cool it down gradually, especially for SET operations. This pulse shaping requirement is met by using a slow-quench pulse shaper. As shown in Figure 4.14, the slow-quench pulse shaper is composed of an arbitrary slow-quench waveform generator and a write driver. In NVSim, the delay impact of the slow-quench shaper is neglected because it is already included in the RESET/SET calculation of the timing model. The energy impact of the shaper is modeled by adding an energy efficiency factor to the RESET/SET operation, whose default value we set to 35% [19]; it can be overridden by advanced users. The area of the slow-quench shapers is modeled by measuring the die photos [1, 19].

4.5.2 Charge Pump

The write operations of NAND flash and some PCRAM chips require voltages higher than the chip supply voltage. Therefore, a charge pump that uses capacitors as energy storage elements to create a higher voltage is necessary in a NAND flash chip design. In NVSim, we neglect the silicon area occupied by the charge pump, since this area can vary a lot depending on the underlying circuit design techniques and is relatively small compared to the cell array area in a large-capacity NAND chip. However, we model the energy dissipated by charge pumps during the program and erase operations in NVSim, because they contribute a considerable portion of the total energy consumption. The energy consumed by charge pumps is taken from an actual NAND flash chip design [81], which specifies that a conventional charge pump consumes 0.25µJ at a 1.8V supply voltage. We use this value as the default in NVSim.

4.6 Validation Result

NVSim is validated against NAND flash chips and several industrial prototype designs of STT-RAM [4], PCRAM [5, 73] and ReRAM [6] in terms of area, latency, and energy. We extract the information from real chip design specifications to set the input parameters required by NVSim, such as capacity, line size, technology node, and array organization.

4.6.1 NAND Flash Validation

It is challenging to validate the NAND flash model in NVSim since the public manufacturer datasheets do not disclose sufficient data on the operation latency and power consumption for validation purposes. Instead, Grupp et al. [3] report both latency and power consumption measurements of several commercial NAND flash chips from different vendors. Grupp's report does not include the NAND flash silicon area; hence, we estimate the actual NAND flash chip area by assuming an area efficiency of 90%. The comparison between the measurements [3] and the estimations given by NVSim is listed in Table 4.2.

Table 4.2: NVSim's NAND flash model validation with respect to a 50nm 2Gb NAND flash chip (B-SLC2) [3]

  Metric            Actual      Projected   Error
  Area              23.85mm2    22.61mm2    -5.20%
  Read latency      21µs        25.2µs      +20.0%
  Program latency   200µs       200.1µs     +0.1%
  Erase latency     1.25ms      1.25ms      +0.0%
  Read energy       1.56µJ      1.85µJ      +18.6%
  Program energy    3.92µJ      4.24µJ      +8.2%
  Erase energy      34.5µJ      36.0µJ      +4.3%

Table 4.3: NVSim's STT-RAM model validation with respect to a 65nm 64Mb STT-RAM prototype chip [4]

  Metric          Actual     Projected   Error
  Area            39.1mm2    38.05mm2    -2.69%
  Read latency    11ns       11.47ns     +4.27%
  Write latency   < 30ns     27.50ns     -
  Write energy    N/A        0.26nJ      -

4.6.2 STT-RAM Validation

We validate the STT-RAM model against a 65nm prototype chip [4]. We let 1 bank = 32×8 mats and 1 mat = 1 subarray to simulate the memory array organization. We also exclude the chip area of I/O pads and duplicated cells to make a fair comparison. As the write latency is not disclosed, we assume the write pulse duration is 20ns. The validation result is listed in Table 4.3.

4.6.3 PCRAM Validation

We first validate the PCRAM model against a 0.12µm MOS-accessed prototype. The array organization is configured to have 2 banks, each of which has 8×8 mats. Every mat contains only one subarray. Table 4.4 lists the validation result, which shows a 10% underestimation of area and a 6% underestimation of read latency. The projected write latency (the SET latency, as the worst case) is also consistent with the actual value. Another PCRAM validation is made against a 90nm diode-accessed prototype [1].

Table 4.4: NVSim's PCRAM model validation with respect to a 0.12µm 64Mb MOS-accessed PCRAM prototype chip [5]

  Metric          Actual      Projected   Error
  Area            64mm2       57.44mm2    -10.25%
  Read latency    70.0ns      65.93ns     -5.81%
  Write latency   > 180.0ns   180.17ns    -
  Write energy    N/A         6.31nJ      -

Table 4.5: NVSim's PCRAM model validation with respect to a 90nm 512Mb diode-selected PCRAM prototype chip [1]

  Metric          Actual     Projected   Error
  Area            91.50mm2   93.04mm2    +1.68%
  Read latency    78ns       59.76ns     -23.40%
  Write latency   430ns      438.55ns    +1.99%
  Write energy    54nJ       47.22nJ     -12.56%

4.6.4 ReRAM Validation

We validate the ReRAM model against a 180nm 4Mb HfO2-based MOS-accessed ReRAM prototype [6]. According to the disclosed data, the subarray size is configured to 128Kb. We further model a bank with 4×8 mats, and each mat contains a single subarray. The validation result is listed in Table 4.6. Note that the estimated chip area given by NVSim is much smaller than the actual value, since the prototype chip has SLC/MLC dual modes but the current version of NVSim does not model the MLC-related circuitry.

Table 4.6: NVSim's ReRAM model validation with respect to a 0.18µm 4Mb MOSFET-selected ReRAM prototype chip [6]

  Metric          Actual          Projected   Error
  Area⁴           187.69mm2       33.42mm2    -
  Read latency    7.2ns           7.72ns      +7.22%
  Write latency   0.3ns - 7.2ns   6.56ns      -
  Write energy    N/A             0.46nJ      -

⁴A large portion of the chip area is contributed by the MLC control and test circuits, which are not modeled in NVSim.

4.7 Summary

To enable the system-level design space exploration of eNVM technologies and to help computer architects leverage these emerging technologies, it is necessary to have a quick estimation tool. While abundant estimation tools are available as SRAM/DRAM design assistants, similar tools for eNVM designs are currently missing. Therefore, we build NVSim, a circuit-level model for NVM performance, energy, and area estimation, which supports various NVM technologies including STT-RAM, PCRAM, ReRAM, and conventional NAND flash. As an extension of CACTI, NVSim also models SRAM and DRAM in a more accurate way, since some false assumptions in CACTI are fixed in NVSim. The usage of NVSim can be divided into two categories:

• NVSim can be used to optimize eNVM designs toward a certain design metric;

• NVSim can also be used to estimate the performance, energy, and area before fabricating a real prototype chip, especially when the emerging NVM device technology is still under development and no standard exists yet.

Under a given process node, the tunable memory design parameters modeled in NVSim include, but are not limited to, the array structure, subarray size, sense amplifier design, write method, repeater design, and buffer design. If necessary, NVSim can also explore different types of transistor or wire models to get the best result. NVSim is implemented in C++ from scratch and contains more than 20,000 lines of source code. We use it for our later architecture- and application-level studies. The NVSim binary code can be downloaded at http://www.rioshering.com/nvsimwiki.

Chapter 5

Architecture-Level: Techniques for Alleviating eNVM Write Overhead

Compared to volatile memories such as SRAM and DRAM, eNVMs have a more stable data-keeping mechanism. Accordingly, it takes a longer time and consumes more energy to overwrite the existing data. Hence, the major drawbacks of eNVMs are their relatively long write latency and high write energy. To alleviate this write overhead, we propose two architecture-level techniques. For these studies, NVSim is used to obtain the eNVM cache/memory parameters such as access latency, access energy, and occupied chip area.

5.1 Directly Replacing SRAM Caches

Typically, an eNVM cache has much higher cell density than its SRAM counterpart. This property makes it feasible to replace an SRAM cache with a much larger eNVM substitute occupying the same silicon area. Using STTRAM as an example, the circuit-level comparison is listed in Table 5.1. In this example, the STTRAM cache is 4 times denser than the SRAM cache. Due to the similar silicon area, the read latency and read energy of the STTRAM cache are close to those of the SRAM cache, but the write latency and the write energy of STTRAM become the major drawbacks. This conclusion holds for other eNVM technologies such as PCRAM and ReRAM.


Table 5.1: Area, access time, and energy comparison between SRAM and STTRAM caches within similar silicon area (65nm technology)

  Cache size      128KB SRAM   512KB STTRAM
  Area            3.62mm2      3.30mm2
  Read latency    2.252ns      2.318ns
  Write latency   2.264ns      11.024ns
  Read energy     0.895nJ      0.858nJ
  Write energy    0.797nJ      4.997nJ

Because of these two drawbacks, we can draw two intuitions about directly replacing SRAM caches with eNVM ones that have similar area but larger capacity:

• eNVM caches with larger capacity can reduce the cache miss rate. However, the long latency associated with the write operations to the eNVM cache has a negative impact on the performance. When the write intensity is high, the benefits caused by miss rate reductions could be offset by the long latency of eNVM write operations and eventually result in performance degradation.

• The non-volatility of eNVM caches can greatly reduce the leakage power. However, when the write intensity is high, the dynamic power increases significantly because of the high write energy. Therefore, the total amount of power savings is reduced.

These two conclusions show that, if we directly replace SRAM caches with eNVM caches using the “same area” strategy, the long latency and the high energy consumption of eNVM write operations can offset the performance and the power benefits brought by eNVM caches when the cache write intensity is high. Hence, novel architecture techniques are required. In this dissertation, two techniques are proposed:

• Read-preemptive write buffer

• Hybrid SRAM-eNVM cache

The details of these two techniques are described in Chapter 5.2 and Chapter 5.3, respectively.

5.2 Read-Preemptive Write Buffer

As mentioned, the long eNVM write latency has a serious impact on the performance and the power consumption. A write buffer is a common technique to hide long write latency. However, a normal write buffer is not effective enough to hide the extra-long latency of eNVM write operations, because multiple read operations can be blocked by an ongoing write operation, which drastically harms the overall performance. To solve this problem, we design a priority policy that grants read operations the right to cancel an ongoing write operation under certain conditions. The scheduling scheme is demonstrated in Figure 5.1, and the rules of such a buffer are as follows,

• Rule 1: The read operation always has the higher priority in a competition for the execution right;

• Rule 2: When a read request is blocked by a write retirement and the write buffer is not full, the read request can trap and stall the write retirement if the progress of that write retirement is less than a threshold. Then, the read operation is granted the execution right. The canceled write retirement will be retried later.
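The two rules can be summarized by a small arbitration function such as the sketch below; it is an illustration of the policy following Rule 2 and the Figure 5.1 example, not the simulator's code, and the threshold parameter corresponds to the retirement accomplishment degree discussed in the following paragraphs.

    #include <cstdio>

    // Read-preemptive arbitration: 'progress' is the fraction of the ongoing
    // write retirement already completed, and 'alpha' is the threshold below
    // which a blocked read may cancel that retirement (Rule 2 / Figure 5.1).
    struct WriteBufferState {
        bool   retiringWrite;   // is a buffered write currently being retired?
        double progress;        // 0.0 .. 1.0 progress of that retirement
        bool   bufferFull;      // is the write buffer full?
    };

    enum class Action { ServeRead, CancelWriteThenRead, WaitForWrite };

    Action arbitrate(bool readPending, const WriteBufferState& s, double alpha = 0.5) {
        if (!readPending) return Action::WaitForWrite;
        if (!s.retiringWrite) return Action::ServeRead;             // Rule 1: reads win
        if (!s.bufferFull && s.progress < alpha)                    // Rule 2: preempt early writes
            return Action::CancelWriteThenRead;                     // canceled write retried later
        return Action::WaitForWrite;
    }

    int main() {
        WriteBufferState early{true, 0.3, false}, late{true, 0.8, false};
        std::printf("early write: %d\n", static_cast<int>(arbitrate(true, early)));
        std::printf("late  write: %d\n", static_cast<int>(arbitrate(true, late)));
        return 0;
    }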

Figure 5.1: In a read-preemptive write buffer, the read operations can be granted over the ongoing write operation if the progress of that write operation is less than 50%.

The proposed read-preemptive policy tries to execute eNVM read requests as early as possible, but the drawback is the necessity to re-execute the canceled write and the possibility of buffer saturation. The key is to find a proper preemption condition. One extreme method is to stall the write retirement as long as there is a read request, which means that read requests can always be executed immediately. Theoretically, if the write buffer size is large enough, no read request will be blocked. However, since the buffer size

is limited, the increased possibility of a full buffer could also harm the performance. In some other cases, stalling write retirements for read requests is not always beneficial. For example, if a write retirement has almost finished, no read request should stall the retirement process. Consequently, we propose to use the retirement accomplishment degree, denoted as α, as the preemption condition. The retirement accomplishment degree is the completion percentage of the ongoing write retirement, below which no preemption will occur. Figure 5.2 compares the IPC of using different α values in the proposed read-preemptive policy. Note that α = 0% represents the non-conditional preemption policy. We can find that, for the workloads with low write intensity, such as galgel and apsi, the performance improves as the α value increases. However, for the workloads with high write intensity, like streamcluster, the performance only improves at the beginning of the α increase. Generally, the α value is set to 50% to make the proposed read-preemptive policy effective for all the workloads.

Figure 5.2: The performance impact of the preemption condition.

5.3 Hybrid SRAM-eNVM Cache

The aforementioned read-preemptive write buffer hides the long eNVM write latency, but the total number of write operations remains the same. In order to reduce the number of write operations to eNVM cache lines, another architecture-level technique, called the hybrid SRAM-eNVM cache, is proposed. The basic idea of the hybrid SRAM-eNVM cache is to make every cache set a mixture of eNVM cache lines and SRAM cache lines and to keep as much write-intensive data in the SRAM part as possible, and hence reduce the number of write

operations to the eNVM part. The management policy of the hybrid SRAM-eNVM cache can be described as follows,

• The cache controller is aware of the locations of the SRAM cache ways and the eNVM cache ways. When there is a write miss, the cache controller first tries to place the data in the SRAM cache ways.

• Data in eNVM lines are migrated to SRAM lines when they are accessed by two successive write operations; a sketch of this policy is given below.
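A minimal sketch of this set-management policy follows; the way counts, the consecutive-write counter, and the victim handling are simplified assumptions for illustration rather than the evaluated implementation.

    #include <array>
    #include <cstdio>

    // Simplified hybrid SRAM-eNVM cache set.
    struct Line { bool valid = false; unsigned tag = 0; unsigned consecutiveWrites = 0; };

    struct HybridSet {
        static constexpr int kSramWays = 1, kEnvmWays = 15;
        std::array<Line, kSramWays> sram;
        std::array<Line, kEnvmWays> envm;

        // On a write miss, prefer an SRAM way so write-intensive data starts there.
        void allocateOnWriteMiss(unsigned tag) {
            for (auto& l : sram) if (!l.valid) { l = {true, tag, 0}; return; }
            for (auto& l : envm) if (!l.valid) { l = {true, tag, 0}; return; }
            envm[0] = {true, tag, 0};   // fall back: evict an eNVM way (policy simplified)
        }

        // On a write hit in eNVM, migrate after two successive write accesses.
        void writeHitEnvm(int way) {
            if (++envm[way].consecutiveWrites >= 2) {
                sram[0] = envm[way];          // migrate into an SRAM way (victim handling omitted)
                envm[way] = Line{};
            }
        }
    };

    int main() {
        HybridSet set;
        set.allocateOnWriteMiss(0x1A);
        std::printf("SRAM way 0 holds tag 0x%X\n", set.sram[0].tag);
        return 0;
    }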

Figure 5.3: STTRAM write intensity with and without hybrid SRAM-STTRAM caches, comparing an 8MB pure STTRAM cache against an 8MB hybrid SRAM-STTRAM cache (15 STTRAM ways + 1 SRAM way).

In order to show the effectiveness of the hybrid SRAM-eNVM cache, we use STTRAM as a case study. Figure 5.3 shows that the number of STTRAM write operations per 1K instructions is reduced dramatically by using the proposed hybrid SRAM-STTRAM approach. As a result, the dynamic power associated with the STTRAM write operations is also reduced and the performance is improved.

5.4 Effectiveness of Read-Preemptive Write Buffer and Hybrid Cache

To evaluate the effectiveness of the proposed read-preemptive write buffer and the hybrid SRAM-eNVM cache techniques, we use an 8MB STTRAM cache to replace a 2MB SRAM L2 cache in a 4-core chip-multiprocessor example. The experimental settings are listed in Table 5.2. We use Simics [82] for performance simulations and a few multi-threaded benchmarks from the SPECOMP [83] and PARSEC [84] suites. Since the performance and power of STTRAM caches are closely related to the transaction intensity, we select simulation workloads, as listed in Table 5.3, so that we have a wide range of transaction intensities to the L2 caches. The average numbers of total transactions (TPKI)¹ and write transactions (WPKI) of the L2 caches are listed in Table 5.3.

Table 5.2: Baseline configuration parameters

  Processors
    Number of cores   8
    Frequency         3GHz
  Memory Hierarchy
    L1 cache          private, 16KB+16KB, 2-way, 64B line, write-through, 2-cycle read/write latency
    SRAM L2           shared, 2MB, 32-way, 64B line, write-back, 7-cycle read/write latency
    STTRAM L2         shared, 8MB, 32-way, 64B line, write-back, 7-cycle read latency, 33-cycle write latency
    Main memory       4GB, 500-cycle latency

Table 5.3: L2 transaction intensities

  Name            TPKI    WPKI
  galgel          1.01    0.31
  apsi            4.15    1.85
  equake          7.94    3.84
  fma3d           8.43    4.00
  swim            19.29   9.76
  streamcluster   55.12   23.326

The high eNVM cell density makes the 8MB STTRAM cache have comparable silicon area to the 2MB SRAM cache, and thus it has the advantage of a larger cache capacity. However, without any additional techniques, directly replacing the SRAM cache by an STTRAM one causes an overall performance loss. On the contrary, the adoption of our two proposed techniques improves the overall performance by 4.9% and reduces the total energy by 73.5%, as shown in Figure 5.4 and Table 5.4.

¹TPKI is the number of total transactions per 1K instructions and WPKI is the number of write transactions per 1K instructions.

Table 5.4: The performance and power improvement (STTRAM cache vs. SRAM cache)

                            Performance   Total Power
  Read-preemptive buffer    9.93%         67.26%
  Hybrid cache              -2.61%        85.45%
  Combined                  4.91%         73.5%

Figure 5.4: The normalized IPC and normalized energy of the 2MB SRAM cache, the 8MB STTRAM cache, and the 8MB hybrid cache with the read-preemptive write buffer, showing the performance improvement and energy reduction after applying the two proposed techniques.

5.5 Summary

Thanks to their high density, fast read access, and non-volatility, eNVM technologies are promising candidates for on-chip caches. However, one of the eNVM disadvantages is their long write latency and high write energy. Even though directly replacing SRAM caches with eNVM ones can result in considerable power savings, the drawback comes from the long latency and high energy consumption of eNVM write operations. As a result, for applications with high cache write intensities, the performance can be degraded and the power savings can be reduced. Therefore, two novel architecture-level techniques are proposed. First, the read-preemptive write buffer is designed to mitigate the performance penalty caused by the long write latency. Second, the hybrid SRAM-eNVM cache is proposed to reduce the eNVM write count. The experimental results show these two techniques can make eNVM caches work effectively for most workloads regardless of their cache write intensities.

Chapter 6

Application-Level: eNVM for File Storage

Recently, the feasibility of multi-level cells (MLC) for eNVM [6, 18, 21, 24], which enable a cell to store more than one bit of digital data, has been demonstrated. This new property makes eNVM more competitive and positions it as a successor of the NAND flash technology, which also has MLC capability but does not have an easy scaling path to reach higher densities. However, the MLC capability of eNVMs such as PCRAM and ReRAM usually comes with the penalty of longer programming time and shortened cell lifetime compared to their single-level cell (SLC) counterparts. Therefore, we propose an adaptive MLC/SLC reconfigurable eNVM design that can exploit both the fast SLC access speed and the large MLC capacity with awareness of workload characteristics and lifetime requirements. Without loss of generality, we investigate MLC PCRAM as an example.

6.1 Multi-Level Cell

Usually, eNVM exploits the large resistance contrast between the "1" and the "0" states. As for PCRAM and ReRAM, due to their large resistance contrast between the RESET and SET states, MLC PCRAM [18, 21, 24] and MLC ReRAM [6] are both feasible. However, the degree of success of such an MLC write depends on the resistance distributions over a large ensemble of eNVM cells.


Figure 6.1: The basic MLC PCRAM programming scheme.

Unlike a single-level cell (SLC) write, where the bit write quality can be ensured by over-SET or over-RESET, the intrinsic randomness associated with each write attempt and the inter-cell variability make it infeasible to have a universal pulse shape for writing an intermediate state. In order to deal with this issue, resistance distribution tightening techniques have been developed based on "program-and-verify" (P&V) procedures.

6.1.1 Extra Write Overhead

P&V is a common programming technique for multi-bit writing and is widely used in NAND flash memories. In order to achieve non-overlapping resistance distributions of different bit levels, P&V needs to iteratively apply a SET pulse and then verify that a specified precision criterion is met, which leads to a much longer write latency. Using PCRAM as an example, the MLC programming algorithm, as illustrated in Figure 6.1, first programs the cell to its low-resistance (SET) state by means of a SET-sweep pulse. This is followed by a single RESET pulse with a fast quench, whose purpose is to initialize the cell to a fully RESET state before partial SET sequences are applied. In the final PGM step, the SET pulse amplitude is gradually increased under feedback-loop control, so that a tight resistance distribution can be achieved [24]. It is obvious that, compared to the SLC RESET and SET operations that only require applying a specific pulse shape, the MLC write scheme has to incorporate at least a SET and a RESET operation in each write operation. Thus, the write latency and the write energy are tremendously larger than those in the SLC mode.

Figure 6.2: SET and RESET resistances during PCRAM cycling, illustrating the difference between failure by "stuck-SET" and by "stuck-RESET".

6.1.2 Extra Read Overhead

Reading data from an MLC device is more difficult than reading data from an SLC device, as it requires distinguishing more precisely between neighboring resistance levels. As we show later in Chapter 6.2, reading MLC data needs more comparison steps, which inevitably causes extra read latency overhead.

6.1.3 Reduced Cell Lifetime

As shown above, P&V needs to initialize all the target cells to the RESET state before each intermediate write. As we discuss in the next section, RESET is the major source of cell wear-out; hence, MLC operation reduces the cell lifetime compared to the SLC mode.

6.1.4 PCRAM Lifetime Model

eNVM usually has limited write endurance. For example, several PCRAM reliability experiments have shown PCRAM write endurance numbers in the range of 10^5 to 10^9 cycles [1, 5, 14]. Two types of failure modes have been observed to happen after cycling, called "stuck-RESET" and "stuck-SET", as shown in Figure 6.2.

Table 6.1: The lifetime model of PCRAM cells.

  RESET cycles    MLC   SLC
  0 - 10^7        Yes   Yes
  10^7 - 10^9     No    Yes
  beyond 10^9     No    No

In a stuck-RESET failure, the device resistance suddenly spikes, and the resistance is stuck at a level that is much higher than the normal RESET state. This stuck-RESET failure is typically caused by void formation or delamination that catastrophically severs the electrical path between the GST and the access device. On the contrary, in a stuck-SET failure, a gradual degradation of the RESET-to-SET resistance margin is usually observed, as demonstrated in Figure 6.2. As a PCRAM cell continues experiencing write cycles, its GST characteristics change, and it becomes more difficult to create an amorphous (RESET) phase in the GST than before. Although a larger-amplitude RESET pulse is able to force the GST to switch between the RESET and SET states, it causes a larger RESET power consumption and, worse, it eventually lets stuck-SET occur earlier. Therefore, in this work, a stronger RESET pulse is not used to prolong the PCRAM cell lifetime. Instead, during the stuck-SET degradation, the MLC cell is reconfigured to SLC mode, as we discuss in the later sections. Degrading MLC to SLC not only bypasses the issue of the decreasing resistance margin; more importantly, writing SLC data does less damage to the PCRAM cell than writing MLC data does. According to the study conducted by Goux et al., stuck-SET failure is due to a change in the RESET condition that is induced by cycling [85]. Their endurance data suggest that endurance scales inversely with the time spent melting, t_m, during each RESET pulse. Their experiment coincides with another study, which observes that cycling with only SET pulses greatly extends endurance (more than 10^12 cycles) over RESET-SET cycling (10^10 cycles) [86]. According to Figure 6.2, in this work we assume that during the first 10^7 RESET-SET cycles, each PCRAM cell can be either in MLC mode or in SLC mode depending on the external configuration; between 10^7 and 10^9 cycles, the RESET resistance degradation

makes the PCRAM cell lose its MLC capability so that it can only work in SLC mode; all PCRAM cells beyond 10^9 cycles are considered non-functional. In addition, it is obvious that each MLC write includes a RESET operation, and we further assume that the RESET/SET distribution of SLC writes is 50%/50%. Therefore, the PCRAM lifetime model can be summarized as tabulated in Table 6.1.
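The lifetime model of Table 6.1 amounts to a simple mode-decision function of the accumulated RESET cycle count, as the following sketch illustrates (the cycle thresholds are taken from the table; everything else is illustrative).

    #include <cstdio>
    #include <cstdint>

    // Allowed cell mode as a function of the RESET cycles already experienced.
    enum class CellMode { MlcOrSlc, SlcOnly, Dead };

    CellMode allowedMode(std::uint64_t resetCycles) {
        if (resetCycles < 10000000ULL)    return CellMode::MlcOrSlc;  // fewer than 10^7 cycles
        if (resetCycles < 1000000000ULL)  return CellMode::SlcOnly;   // 10^7 to 10^9 cycles
        return CellMode::Dead;                                        // beyond 10^9 cycles
    }

    int main() {
        std::printf("1e6 cycles  -> %d\n", static_cast<int>(allowedMode(1000000ULL)));
        std::printf("1e8 cycles  -> %d\n", static_cast<int>(allowedMode(100000000ULL)));
        std::printf("1e10 cycles -> %d\n", static_cast<int>(allowedMode(10000000000ULL)));
        return 0;
    }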

6.2 Adaptive MLC/SLC PCRAM Array Structure

In this section, we demonstrate how an MLC PCRAM array can be configured to support SLC accesses on the fly while incurring only a small hardware overhead. In addition, a density control layer is designed to manage the variable PCRAM capacity and to track the density mode of PCRAM cells at a proper granularity.

6.2.1 MLC/SLC Write: SET, RESET, and PGM Pulses

An MLC write has to initialize the target cell to the RESET state and then iteratively apply PGM pulses (partial SET pulses with fixed pulse duration but different pulse amplitudes, or vice versa) until the targeted intermediate resistance level is reached. A full SET pulse (SET-sweep pulse) is also used to program the cell into the SET state. Therefore, SET, RESET, and PGM pulse generators are all required in the MLC PCRAM chip, as shown in Figure 6.3. The components in grey are optional for MLC support; the other components are required for basic SLC PCRAM operations. In order for a degraded MLC PCRAM cell to work in SLC mode, it is straightforward to use only the SET and RESET pulse generators to program the cells. During the SLC writing process, the PGM pulse generator and its associated iteration control logic are bypassed because there are no intermediate resistance levels to program the cells into.

6.2.2 MLC/SLC Read: Dual-Mode Sense Amplifier

Since every PCRAM cell in MLC mode stores more than one bit, during the MLC reading process each sense amplifier output is compared to multiple references. The comparison results are latched separately and then encoded into multi-bit data. Figure 6.3 illustrates an example of the sensing scheme for a 2-bit MLC PCRAM array.

Figure 6.3: The block diagram of the PCRAM array organization that supports both MLC and SLC operations.

In this scheme, a ramp generator triggers three reference sense amplifiers at different times; the output of each reference sense amplifier triggers a corresponding flip-flop, which eventually stores the result of the comparison between the bit-line current and one of the reference currents; and finally the value stored in the flip-flops ("000", "001", "011", or "111") is encoded into 2-bit data.

Figure 6.4: The conceptual view of managing SLC and MLC modes.

However, when an MLC PCRAM cell degrades to SLC mode, most of the components in the MLC sensing scheme become unnecessary. The output node of the bit-line sense amplifier can be used directly as the SLC data output. Therefore, switching from MLC to SLC mode does not require adding any significant peripheral circuitry, only some simple control logic. The shaded part of Figure 6.3 shows the components that can be removed for SLC PCRAM read and write operations.
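The read-out encoding described in this subsection can be sketched as follows; the mapping from the thermometer code to the 2-bit value and the treatment of the SLC path are assumptions for illustration only.

    #include <bitset>
    #include <cstdio>

    // MLC read-out: the three reference comparisons produce a thermometer code
    // ("000", "001", "011", "111"), which is encoded into the 2-bit stored value.
    unsigned encodeMlc(const std::bitset<3>& thermometer) {
        return static_cast<unsigned>(thermometer.count());   // 0..3 -> 2-bit value
    }

    // In SLC mode only the plain bit-line sense amplifier output is used.
    unsigned encodeSlc(bool senseAmpOut) { return senseAmpOut ? 1u : 0u; }

    int main() {
        std::printf("MLC '011' -> %u\n", encodeMlc(std::bitset<3>("011")));
        std::printf("SLC '1'   -> %u\n", encodeSlc(true));
        return 0;
    }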

6.2.3 Address Re-mapping

With the capability of changing the PCRAM cells between the MLC mode and the SLC mode on the fly, the effective PCRAM device capacity can vary at runtime. In order to control such a variable device size, an address re-mapping mechanism is introduced, as illustrated in Figure 6.4. In our proposal, the PCRAM chip is divided into two banks, and each bank has a complete set of I/O paths. The PCRAM cells in each bank are grouped into large blocks. As the optimal block size of a state-of-the-art file system (e.g., Ext2) on large volumes (more than 1GB) is 4KB, we group every 16K PCRAM cells into one block. Thereby, if the block is in SLC mode, it has a capacity of 2KB, and if it is in 2-bit MLC mode, it has a capacity of 4KB.

Algorithm 1 Address Re-mapping Algorithm
  bankID = getBankID(blockAddress)
  rowID = getRowID(blockAddress)
  if Bitmap[rowID] = MLC then
      Activate Bank[bankID]
      Access Block[rowID]
  else
      if bankID = 0 then
          Activate all the banks
          Access Block[rowID] in all the banks
          Combine the SLC blocks into a 4KB block
      else
          Return bad block
      end if
  end if

In this design, the blocks having the same row address are always in the same mode. Hence, a bitmap is used to indicate whether a block is in SLC or MLC mode. The general address re-mapping algorithm is described by Algorithm 1. In short, if the bitmap indicates that the accessed block is MLC, then only the corresponding bank is activated. However, if the accessed block is SLC, only the address mapped to the first bank (the MSB bank) is valid, and the 4KB I/O block is accessed by combining two 2KB SLC blocks (one in the MSB bank and the other in the LSB bank). In SLC mode, all the accesses to the LSB banks are invalid and end up with a 'bad block' signal. In the case of Linux, it is possible to supply a bad-block list, which is most easily generated by running 'badblocks' at disk formatting time. The state-of-the-art file system (e.g., Ext2) uses a special sort of 'hidden file' to which it allocates all of the bad blocks on the file system. This technique ensures that those data blocks will never be accessed or used by any other files. On the other hand, when a block changes its mode from SLC to MLC, it is straightforward to remove this block from the 'bad block' list, and thus it becomes available to be allocated to other files. The bitmap indicating the MLC/SLC status is the only hardware overhead needed to enable our adaptive MLC/SLC proposal. In the case of a PCRAM device with 4G cells (512MB as pure SLC PCRAM or 1GB as pure MLC PCRAM), each bank contains 128K rows and the bitmap size is 16KB.
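For illustration, Algorithm 1 can be rendered in C++ roughly as below; the class and function names, and the use of std::optional to signal a bad block, are assumptions rather than the device firmware interface.

    #include <cstdio>
    #include <optional>
    #include <vector>

    // A C++ rendering of Algorithm 1. Blocks sharing a row index are always in
    // the same mode, so one bitmap bit per row is sufficient.
    struct RemapResult { int bank; int row; bool combineSlcPair; };

    class DensityControl {
    public:
        explicit DensityControl(std::size_t rows) : isMlc_(rows, true) {}
        void setMode(int row, bool mlc) { isMlc_[row] = mlc; }

        // Returns std::nullopt to signal a 'bad block' (an LSB-bank address in SLC mode).
        std::optional<RemapResult> remap(int bank, int row) const {
            if (isMlc_[row]) return RemapResult{bank, row, false};   // MLC: one bank holds 4KB
            if (bank == 0)   return RemapResult{0, row, true};       // SLC: combine MSB+LSB 2KB blocks
            return std::nullopt;                                     // SLC access via LSB bank is invalid
        }
    private:
        std::vector<bool> isMlc_;   // the per-row MLC/SLC bitmap
    };

    int main() {
        DensityControl ctrl(128 * 1024);
        ctrl.setMode(42, false);                       // row 42 degraded to SLC
        auto r = ctrl.remap(1, 42);
        std::printf("bank 1, row 42 is %s\n", r ? "valid" : "a bad block");
        return 0;
    }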

In our design, the first four PCRAM blocks are always set as MLC, and they store this 16KB bitmap. The bitmap is loaded into the system main memory at boot time, and it is written back to the PCRAM device at unmount time. Here, we assume the PCRAM device has a sufficient on-device capacitor so that the bitmap data can always be written back even upon power emergency events.

6.2.4 Reconfigurable PCRAM-based Solid-State Disk

In the near future, PCRAM is considered a direct substitute for NAND flash. Considering that most of the NAND flash-based devices used today have low capacity utilization (e.g., personal flash devices, SD cards in digital cameras, etc.), we propose two ways to partition the MLC and SLC blocks in order to put the adaptive MLC/SLC PCRAM device into real practice. To optimize the PCRAM device for performance, the rationale is to first set all the PCRAM blocks in SLC mode for fast read/write accesses. In this way, the initial device capacity utilization is 50%. When the required device utilization surpasses 50%, the SLC blocks that belong to the least recently modified file are merged into MLC blocks, and therefore extra device space is left for the new files. The expansion process continues until the device capacity utilization reaches 100%, at which point all the PCRAM blocks are in MLC mode.

6.3 Experimental Results

In this section, we evaluate the performance and lifetime improvement after applying the adaptive MLC/SLC technique when the PCRAM device is under-utilized. In order to estimate the efficiency of the proposed technique on a real platform, we collected actual I/O traces on a Linux 2.6.32-23 kernel using a 2GB RAMDISK formatted as an Ext2 file system with a 4KB block size. We created a synthetic file system trace by first filling up the RAMDISK with randomly generated files whose sizes range from 5KB to 10MB and then randomly accessing them 1,500,000 times.

Figure 6.5: The performance (throughput in MB/s versus PCM device utilization) of the adaptive MLC/SLC solution under different utilizations for (a) Synthetic, (b) Financial 1, (c) Financial 2, and (d) WebSearch.

In addition, we used disk traces from the Storage Performance Council [87], which were intended to model disk behavior in enterprise-level applications like web servers, database servers, and web search. Later in this section, we refer to our synthetic trace as Synthetic and to the traces from SPC as Financial 1, Financial 2, and WebSearch, respectively.

6.3.1 PCRAM MLC/SLC Timing Model

We use a preliminary version of NVSim to estimate the read and write latencies for the SLC and MLC modes. For simplicity, we average the SET latency and the RESET latency in the SLC write latency calculation, and we set the average number of P&V steps to 4 to form the MLC write latency. The rounded values are tabulated in Table 6.2. The read width is set to 64 cells (64 bits for SLC and 128 bits for MLC) and the write width is limited to 16 cells (16 bits for SLC and 32 bits for MLC) due to the large amount of current that is required for SET and RESET operations. Under these assumptions, the I/O bandwidth of the SLC mode is about twice the MLC bandwidth.

Figure 6.6: The performance-per-cost analysis (relative performance/cost versus the percentage of MLC blocks) of the adaptive MLC/SLC solution for (a) Synthetic, (b) Financial 1, (c) Financial 2, and (d) WebSearch.

Table 6.2: The timing model of PCRAM cells in SLC and MLC modes.

                    SLC        MLC
  Read latency      10ns       44ns
  Read width        64 bits    128 bits
  Read bandwidth    800MB/s    363.6MB/s
  Write latency     100ns      395ns
  Write width       16 bits    32 bits
  Write bandwidth   20.0MB/s   10.3MB/s
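The bandwidth rows of Table 6.2 follow directly from the width and latency rows (bandwidth = access width in bytes divided by access latency), as the small check below shows; the MLC write row comes out near 10.1MB/s here, slightly below the 10.3MB/s listed, presumably because of rounding in the original table.

    #include <cstdio>

    // Recompute the bandwidth entries of Table 6.2 from width and latency.
    int main() {
        struct Row { const char* name; double bits; double latencyNs; };
        const Row rows[] = {
            {"SLC read ", 64, 10}, {"SLC write", 16, 100},
            {"MLC read ", 128, 44}, {"MLC write", 32, 395},
        };
        for (const Row& r : rows) {
            double mbPerS = (r.bits / 8.0) / (r.latencyNs * 1e-9) / 1e6;
            std::printf("%s: %.1f MB/s\n", r.name, mbPerS);
        }
        return 0;
    }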

6.3.2 Performance-Aware Management Result

Firstly, we demonstrate how the performance-aware partitioning strategy can improve the performance. The I/O access distribution has a large impact on the efficiency of the proposed adaptive MLC/SLC PCRAM. If the accesses are evenly distributed to every portion of the file system, a certain amount of data has to be accessed from the MLC regions. On the other hand, if the access pattern is biased and a part of the file system is not frequently accessed, then that part of the files can be partitioned into the MLC regions and most of the frequently-accessed data can be served by accessing only the SLC regions. In this analysis, we assume that many PCRAM chips are connected in an array to form a large PCRAM device that has sufficient storage capacity to hold the working set (the number of PCRAM chips is varied to adjust the device utilization). When the device utilization is lower than 50%, all the PCRAM blocks are in SLC mode, and when the device utilization is 100%, all the PCRAM blocks are in MLC mode. When the utilization is between 50% and 100%, the adaptive MLC/SLC technique is applied to supply the required capacity. Figure 6.5 illustrates the relationship between the average throughput and the PCRAM device utilization under different workloads. It can be observed that Synthetic and Financial 2 show a gradual throughput degradation as the utilization increases. This is due to the high data locality of these two workloads. Furthermore, this phenomenon is exaggerated in Financial 1, which has a large portion of the file system that is accessed only once. However, the throughput of WebSearch drops abruptly at 50%, as this workload has an evenly distributed I/O access pattern.

6.3.3 Performance-Cost Analysis

Based on the previous result on the relationship between the performance and the device utilization, we can further derive the relationship between the performance and the cost by assuming each PCRAM chip has a fixed cost. As an example, under the workload access pattern of Financial 1, one extreme configuration is to use SLC-only PCRAM chips that can supply a bandwidth of 94MB/s, while the other extreme configuration is to use MLC-only PCRAM chips that only have a bandwidth of 44MB/s but halve the required PCRAM chip count. To investigate the throughput-per-cost metric, we rephrase the results of Figure 6.5, and Figure 6.6 shows where the throughput-per-cost reaches its peak value. The average improvement in throughput-per-cost is around 28%.

6.3.4 Lifetime Analysis

When an MLC PCRAM cell has experienced too many RESET operations, its RESET/SET resistance margin decreases. Therefore, it has to be configured as an SLC cell at that point.

In the lifetime-aware partitioning strategy, all the PCRAM blocks in the MSB bank are first set to MLC mode and the ones in the LSB bank are left empty. When the operating system monitor finds that certain blocks have a higher access probability, these blocks are switched to SLC mode by enabling their associated blocks in the LSB bank. Using the lifetime-aware partitioning strategy, the maximum lifetime improvement can be as high as 100× according to the lifetime model described in Chapter 6.1.4. This maximum amount of lifetime improvement comes from halving the device capacity.

6.4 Summary

The proposed adaptive MLC/SLC scheme exploits the fast SLC access speed and the large MLC capacity with awareness of workload characteristics and lifetime requirements. In this part of the dissertation, a circuit-level adaptive MLC/SLC eNVM array is first devised, and the management policy for the MLC/SLC modes is then designed. To evaluate the proposed adaptive MLC/SLC scheme, a case study on a PCRAM-based storage device is conducted. The simulation results based on four actual I/O traces show that the adaptive MLC/SLC technique can improve the throughput-per-cost of the PCRAM device by 28% on average, or it can extend the PCRAM device lifetime by 100× if the device utilization is under 50%.

Chapter 7

Application-Level: eNVM for Exascale Fault Tolerance

One of the system-level applications that we have considered for eNVMs is to enhance the HDD-based checkpointing/rollback scheme, which is one of the most common approaches to ensure the fault resilience of a computing system. In current petascale computing systems, HDD-based checkpointing already incurs a large performance overhead and is not a scalable solution for future exascale computing. To solve this problem, we investigate how to use eNVM technologies to provide a fault-resilience scheme with a low performance penalty.

7.1 Problem

For large-scale applications in massively parallel processing (MPP) systems, coordinated checkpoint-restart is the most widely used technique to provide fault tolerance [88]. As the scale of future MPP systems keeps increasing and the system MTTF keeps decreasing, it is foreseeable that checkpoint protection at a higher frequency will be required. However, the current state-of-the-art approach, which takes a snapshot of the entire memory image and stores it into globally accessible storage (typically built with disk arrays), as shown in Figure 7.1, is not scalable and not feasible for future exascale systems.


Figure 7.1: The typical organization of a contemporary supercomputer. All the permanent storage devices are controlled by I/O nodes; there is no local permanent storage at each node.

Figure 7.2: The proposed new organization that supports hybrid checkpoints. The primary permanent storage devices are still connected through I/O nodes, but each process node also has local permanent storage.

There are two primary obstacles that prevent performance scaling.

Bottleneck 1: HDD Data Transfer Bandwidth

As shown in Figure 7.1, the checkpoint storage device used in practice is the HDD, which implies that the most serious bottleneck of in-disk checkpointing is the sustained transfer rate of HDDs (usually less than 150MB/s). The significance of this problem is demonstrated by the facts that the I/O generated by HDD-based checkpointing consumes nearly 80% of the total file system usage even on today's MPP systems [88], and that the checkpoint overhead accounts for over 25% of the total application execution time in a petaFLOPS system [89]. Although a distributed file system, such as Lustre or Spider, can aggregate the file system bandwidth to hundreds of GB/s, in such systems the checkpoint size is also aggregated by the scale of nodes, nullifying the benefit. Since the HDD data transfer bandwidth is not easily scaled up due to its mechanical nature, it is necessary to change the future checkpoint storage from in-disk to in-memory.

In order to quantify the speed difference between in-disk and in-memory checkpointing, we measure their peak sustainable speeds on a hardware configuration with two dual-core AMD Opteron 2220 processors, 16GB of ECC-protected registered DDR2-667 memory, and Western Digital 740 hard disk drives operating at 10,000 RPM with a peak bandwidth of 150MB/s reported in the datasheet. As a block device, the HDD shows a large variation in its effective bandwidth depending on the access pattern. Although the datasheet reports a peak bandwidth of 150MB/s, the actual working bandwidth is much smaller. We measure the actual HDD bandwidth by randomly copying files of different sizes and using the system clock to track the elapsed time. The result is plotted in Figure 7.3: all the points fall into two regions, one near the y-axis and the other at the 50MB/s line. When the write size is relatively small, the effective write bandwidth of the HDD ranges from 60MB/s to 100MB/s depending on the status of the HDD internal buffer. However, when the write size reaches the megabyte scale, the effective write bandwidth of the HDD drops dramatically to about 50MB/s, only one third of its 150MB/s peak.

Figure 7.3: The hard disk drive bandwidth with different write sizes (Region 1: write sizes that fit into the HDD buffer; Region 2: sustained bandwidth for large-size write operations).
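A minimal sketch of this kind of file-copy measurement is shown below (illustrative only: the file sizes, target path, and use of os.fsync are assumptions rather than the exact methodology behind Figure 7.3).

```python
import os, time

def measure_write_bandwidth(path, size_mb):
    """Write size_mb of data to `path` and return the effective bandwidth in MB/s."""
    data = os.urandom(1 << 20)                  # 1MB of random data
    start = time.time()
    with open(path, "wb") as f:
        for _ in range(size_mb):
            f.write(data)
        f.flush()
        os.fsync(f.fileno())                    # force the data out of the page cache
    return size_mb / (time.time() - start)

# Sweep write sizes to reproduce the two regions seen in Figure 7.3.
for size in (16, 64, 256, 1024):
    print(size, "MB ->", round(measure_write_bandwidth("/tmp/ckpt.bin", size), 1), "MB/s")
```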

Figure 7.4: The main memory bandwidth with different write sizes (Region 1: random small-size write operations; Region 2: sequential big-size write operations).

On the contrary, the result of the in-memory checkpointing speed is shown in Figure 7.4. Similar to the HDD bandwidth, all the collected data fall into two regions. However, unlike the HDD case, the attainable bandwidth is higher when the write size is large, thanks to spatial locality. This behavior is desirable for checkpointing, since checkpoints are usually large. In addition, the achievable bandwidth is very close to 5333MB/s, the theoretical peak bandwidth of the DDR2-667 memory used in this experiment. Therefore, compared to the in-disk checkpointing speed, the attainable in-memory speed can be two orders of magnitude faster. Hence, eNVM technologies are ideal candidates for implementing in-memory checkpointing.

Bottleneck 2: Centralized Checkpoint Storage

Another bottleneck of the current checkpointing system, as shown in Figure 7.1, comes from the centralized checkpoint storage. Typically, several nodes in the system are assigned to be I/O nodes that are in charge of the HDD accesses. Thus, the checkpoints of each node (including compute nodes and I/O nodes) have to go through the I/O nodes via network connections before reaching their final destinations, which consumes a large part of the system I/O bandwidth and causes burst congestion. Therefore, the network bandwidth and the back-end file system bandwidth both become performance bottlenecks for the system checkpointing speed. With today's technology, the node-to-node interconnect bandwidth is usually still less than 4GB/s (e.g., 4X QDR InfiniBand), and the aggregate bandwidth of the file system used by Jaguar at Oak Ridge National Laboratory (the Spider file system) is 240GB/s. Considering that coordinated checkpointing requires all the nodes to dump their checkpoints with the same time stamp, the per-node checkpointing speed is limited to 12MB/s for a 1-petaFLOPS system with 20,000 nodes (in this case, the bandwidth of the centralized file system becomes the primary bottleneck). In addition, as the system scale keeps growing, the physical distance between the checkpoint sources and targets increases, which not only hurts performance but also wastes a large amount of power on data transfers. To solve this bottleneck, we propose a hybrid checkpointing mechanism that uses both local and global checkpoints, in which the local checkpoint is faster and does not need any network connection, while the slower global checkpoint is still preserved to provide full fault coverage.
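The 12MB/s figure follows directly from dividing the aggregate file system bandwidth by the node count; a one-line check using the numbers quoted above is shown below.

```python
# Coordinated checkpointing: every node writes at the same time, so the
# centralized file system bandwidth is shared evenly across all nodes.
aggregate_fs_bandwidth_gb_s = 240       # Spider file system, as quoted above
nodes = 20_000                          # 1-petaFLOPS system
per_node_mb_s = aggregate_fs_bandwidth_gb_s * 1024 / nodes
print(per_node_mb_s)                    # ~12 MB/s per node
```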

7.2 Integrating eNVM Modules into MPP Systems

Without loss of generality, we use PCRAM as the case study for deploying eNVM resources. The integration method is similar to that of traditional DRAM Dual-Inline Memory Modules (DIMMs); in this dissertation, we call this integration method PCRAM-DIMM. The re-engineered PCRAM-DIMM is designed to be compatible with the DDR3 interface and can be plugged directly into a DIMM socket. Therefore, only a change in the memory controller firmware is needed to support the new timing parameters of PCRAM read and write operations. Although re-engineering such a PCRAM-DIMM module incurs extra cost, we envision that it can become a standard module for future fault-tolerant MPP systems, which would minimize its negative impact on the system cost. Besides the cost concern, the biggest design issue of PCRAM-DIMM is the long write latency of PCRAM. While some PCRAM prototypes show a read latency longer than 50ns [14,19,24,75], the read latency (from address decoding to data sensing) can be reduced to around 10ns by cutting the bitlines and wordlines inside a PCRAM array into small segments [56]. However, the reduction of the PCRAM write latency is limited by the long SET pulse (~100ns). If the conventional DRAM-DIMM organization shown in Figure 7.5 is directly adopted, the resulting PCRAM write bandwidth is only 0.08GB/s, which is far below the DDR3-1333 bandwidth of 10.67GB/s. To solve the bandwidth mismatch between the DDRx bus and the PCRAM chip, two modifications are made to organize the new PCRAM-DIMM:

Figure 7.5: The organization of a conventional 18-chip DRAM DIMM with ECC support.

Figure 7.6: The organization of the proposed 18-chip PCRAM DIMM with ECC support.

Table 7.1: Different Configurations of the PCRAM Chips

Process   Capacity   # of Banks   Read/RESET/SET    Leakage   Die Area
65nm      512Mb      4            27ns/55ns/115ns   64.8mW    109mm2
65nm      512Mb      8            19ns/48ns/108ns   75.5mW    126mm2
45nm      1024Mb     4            18ns/46ns/106ns   60.8mW    95mm2
45nm      1024Mb     8            16ns/46ns/106ns   62.8mW    105mm2

1. As shown in Figure 7.6, the configuration of each PCRAM chip is changed to x72 (64 bits of data and 8 bits of ECC protection), while the 8x prefetching scheme is retained for compatibility with the DDR3 protocol. As a result, there are 72 × 8 data latches in each PCRAM chip, and during each PCRAM write operation, 576 bits are written into the PCRAM cell array in parallel;

2. The 18 chips on the DIMM are re-organized in an interleaved way. For each data transfer, only one PCRAM chip is selected. An 18-to-1 data mux/demux is added on the DIMM to select the proper PCRAM chip for every DDR3 data transfer.

Consequently, the write latencies of the individual PCRAM chips can be overlapped. The overhead of this new DIMM organization includes: (1) one 18-to-1 data mux/demux; (2) 576 sets of data latches, sense amplifiers, and write drivers on each PCRAM chip. The mux/demux can be implemented by a circuit that decodes the DDR3 address into 18 chip select signals (CS#). The overhead of the data latches, sense amplifiers, and write drivers is evaluated using NVSim. Various configurations are evaluated by NVSim, and the results are listed in Table 7.1. Based primarily on SET latency and area efficiency, we use the 45nm 1024Mb 4-bank PCRAM chip design. The PCRAM-DIMM write bandwidth of this configuration is 64bit × 8(prefetch) × 18(chips)/106ns = 10.8GB/s, which can saturate the DDR3-1333 peak bandwidth of 10.67GB/s. In addition, according to our NVSim power model, each 576-bit RESET and SET operation consumes 31.5nJ and 19.6nJ of dynamic energy, respectively. Therefore, assuming that "0" and "1" are written uniformly, the average dynamic energy is 25.6nJ per 512 bits (excluding the 64 bits of ECC), and the dynamic power of the 1024Mb PCRAM-DIMM under write operations is 25.6nJ/512b × 10.8GB/s ≈ 4.34W. The leakage power of the 18-chip PCRAM-DIMM (consumed by the peripheral circuitry of the PCRAM chips) is estimated to be 60.8mW × 18 = 1.1W.
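The bandwidth and power figures above follow from simple arithmetic on the Table 7.1 parameters; a sketch of the calculation, using only numbers quoted in the text, is given below.

```python
# PCRAM-DIMM write bandwidth: 18 interleaved chips, each writing 64b x 8 (prefetch)
# per 106ns SET latency (45nm 1024Mb 4-bank configuration in Table 7.1).
bits_per_chip_write = 64 * 8
chips = 18
set_latency_s = 106e-9
write_bw_B_s = bits_per_chip_write * chips / 8 / set_latency_s
print(round(write_bw_B_s / 1e9, 1))        # ~10.8-10.9 GB/s, saturates DDR3-1333

# Average dynamic write power, assuming "0" and "1" are written uniformly.
energy_per_576b_write_nj = (31.5 + 19.6) / 2   # mean of RESET and SET energy
write_power_w = energy_per_576b_write_nj * 1e-9 / (512 / 8) * write_bw_B_s
print(round(write_power_w, 2))             # ~4.3 W dynamic power under full write load

# Leakage of the 18-chip DIMM (peripheral circuitry only).
print(round(60.8e-3 * 18, 2))              # ~1.1 W
```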

7.3 Local/Global Hybrid Checkpoint

Integrating PCRAM into future MPP systems and using it as fast in-memory checkpoint storage removes the first performance bottleneck, the slow HDD speed. However, the second bottleneck, the centralized I/O storage, still exists. To further remove this bottleneck, a hybrid checkpointing scheme with both local and global checkpoints is proposed. This scheme works efficiently because most system failures can be recovered locally without the involvement of other nodes.

7.3.1 Hybrid Checkpoint Scheme

We propose to add local checkpoints that periodically back up the state of each node in its own private storage. Every node has dedicated local storage for storing its system state. Similar to its global counterpart, the local checkpointing is done in a coordinated fashion. We assume that a global checkpoint is made from an existing local checkpoint. Figure 7.7 shows the conceptual view of the hybrid checkpoint scheme:

• Step 1: Each node dumps its memory image to its own local checkpoint;

• Step 2: After several local checkpoint intervals, a global checkpoint is initiated, and the new global checkpoints are made from the latest local checkpoints;

• Step 3: When there is a failure but all the local checkpoints are accessible, the local checkpoints are loaded to restore the computation;

• Step 4: When there is a failure and parts of the local checkpoints are lost (in this case, Node 3 is lost), the global checkpoints (which might be obsolete compared to the latest local checkpoints) are loaded, and the failure node is substituted by a backup node.


Figure 7.7: The local/global hybrid checkpoint model.

This two-level hybrid checkpointing gives us the opportunity to tune the local-to-global checkpoint ratio based on the failure types. For example, a system dominated by transient failures can be protected by frequent local checkpoints and a limited number of expensive global checkpoints without losing performance.

The proposed local/global checkpointing is also effective in handling failures that occur during a checkpoint operation. Since the scheme does not allow concurrent local and global checkpointing, there will always be a stable state for the system to roll back to even when a failure occurs during the checkpointing process. The only time the rollback operation is not possible is when a node fails completely in the middle of making a global checkpoint. While such failure events can be handled by maintaining multiple global copies, the probability of such a failure occurring in the middle of a global checkpoint is relatively negligible. Hence, we limit our proposal to a single copy of local and global checkpoints.

Whether the MPP system can be recovered using a local checkpoint after a failure depends on the failure type. In this dissertation, all the system failures are divided into two categories:

• Failures that can be recovered by local checkpoints: In this case, the local checkpoint in the failure node is still accessible. If the system error is transient (e.g., a soft error, an accidental human operation, or a software bug), the MPP system can simply be recovered by rebooting the failure node using its local checkpoint. If the system error is due to system maintenance or hot plug/unplug, the MPP system can also be recovered by simply rebooting, or by migrating the computation task from one node to another using local checkpoints.

Table 7.2: The Statistics of the Failure Root Cause Collected by LANL during 1996-2005

Cause          Occurrence   Percentage
Hardware       14341        60.4%
Software       5361         22.6%
Network        421          1.8%
Human          149          0.6%
Facilities     362          1.5%
Undetermined   3105         13.1%
Total          23739        100%

• Failures that have to be recovered by global checkpoints: In the event of some permanent failures, the local checkpoint in the failure node is not accessible any more. For example, if the CPU, the I/O controller, or the local storage itself fails to work, the local checkpoint information will be lost. This sort of failure has to be protected by the global checkpoint, which requires storing system state in either neighboring nodes or a global storage device.

As a hierarchical approach, whenever the system fails, the system will first try to recover from the local checkpoints. If one or more than one of the local checkpoints is not accessible, the system recovery mechanism will restart from the global checkpoint.

7.3.2 System Failure Category Analysis

The effectiveness of the proposed local/global hybrid checkpointing depends on how many failures can be recovered locally. A thorough analysis of failure rates of MPP systems shows that a majority of failures are transient in nature [90] and can be recovered by using local checkpoints only. In order to quantify the failure distribution, we studied the failure events collected by the Los Alamos National Laboratory (LANL) during 1996-2005 [91]. The data cover 22 high-performance computing systems, including a total of 4,750 machines and 24,101 processors. The statistics of the failure root causes are shown in Table 7.2. We conservatively assume that undetermined failures have to rely on global checkpoints for recovery, and assume that the failures caused by software, network, human, and facilities problems can be protected by local checkpoints:

• If nodes halt due to software failures or human mis-operation, we assume that some mechanism (e.g., a timeout) can detect these failures and the failure node will be rebooted automatically.

• If nodes halt due to network failures (e.g., widespread network congestion) or facilities downtime (e.g., a global power outage), automatic recovery is impossible and manual diagnosis/repair time is inevitable. However, after resolving the problem, the system can simply restart by using local checkpoints.

The remaining hardware failures account for more than 60% of total failures. However, according to research on the fatal soft error rate of the "ASCI Q" system at LANL in 2004 [90], it is estimated that about 64% of the hardware failures are attributed to soft errors. Hence, from the failure trace, we have the following statistics: 60.4% × 64% = 38.7% soft errors and 60.4% × (1 − 64%) = 21.7% hard errors. As soft errors are transient and it is highly unlikely that the same error would happen again after the system is restored from the latest checkpoint, local checkpoints are capable of covering all the soft errors. However, hard errors usually mean there is permanent damage to the failure node and the node needs to be replaced. In this case, the local checkpoint stored on the failure node is lost as well; hence only the global checkpoint can protect the system from hard errors. As a result, we estimate that in total 65.2% of failures can be corrected by local checkpoints and only 34.8% of failures need global checkpoints. Further considering that the soft error rate (SER) will greatly increase as the device size shrinks, we project that the SER increased 4 times from 2004 to 2008. Therefore, we further estimate that, for the 1-petaFLOPS system in 2008, 83.9% of failures need only local checkpoints and only 16.1% of failures need global ones. This failure distribution, biased toward locally recoverable errors, provides a significant opportunity for the local/global hybrid checkpointing scheme to reduce the overhead.
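The percentages above can be reproduced from Table 7.2 and the 64% soft-error fraction; the short sketch below walks through the arithmetic (the 4X SER scaling factor is the projection stated in the text).

```python
# Failure root-cause percentages from Table 7.2.
hardware, software, network, human, facilities, undetermined = 60.4, 22.6, 1.8, 0.6, 1.5, 13.1

soft = hardware * 0.64            # 38.7%: hardware failures that are soft errors
hard = hardware * (1 - 0.64)      # 21.7%: permanent hardware failures

local_2004 = soft + software + network + human + facilities    # 65.2%
global_2004 = hard + undetermined                               # 34.8%

# Project the soft error rate growing 4X between 2004 and 2008; all other
# failure categories are assumed to stay constant.
soft_2008 = soft * 4
local_2008 = soft_2008 + software + network + human + facilities
total_2008 = local_2008 + hard + undetermined
print(round(local_2004, 1), round(global_2004, 1))               # 65.2 34.8
print(round(100 * local_2008 / total_2008, 1))                   # ~83.9% locally recoverable
```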


Figure 7.8: A conceptual view of execution time divided into checkpoint intervals: (a) an application running without failure; (b) an application running with a failure, where the system rewinds back to the most recent checkpoint, and it is recovered by the local checkpoint; (c) an application running with a failure that cannot be protected by the local checkpoint. Hence, the system rewinds back to the most recent global checkpoint.

7.3.3 Theoretical Performance Model

In an MPP system with checkpointing, the optimal checkpoint frequency is a function of both the failure rate and the checkpoint overhead. A low checkpoint frequency reduces the impact of the checkpoint overhead on performance but loses more useful work when failures take place, and vice versa. Young [92] and Daly [93] derived expressions to determine the optimal checkpoint frequency that strikes the right balance between the checkpoint overhead and the amount of useful work lost during failures. However, their models do not support local/global hybrid checkpointing. In this dissertation, we extend Daly's work [93] and derive a new model to calculate the optimal checkpoint frequencies for both local and global checkpoints. Let us consider a scenario with the parameters listed in Table 7.3 and divide the total execution time of a checkpointed workload, T_{total}, into four parts:

T_{total} = T_S + T_{dump} + T_{rollback,recovery} + T_{extra-rollback} \quad (7.1)

where T_S is the original computation time of a workload, T_{dump} is the time spent on checkpointing, T_{rollback,recovery} is the recovery cost when a failure occurs (no matter whether it is local or global), and T_{extra-rollback} is the extra cost of discarding more useful work when a global failure occurs.

Table 7.3: Local/Global Hybrid Checkpointing Parameters

Symbol     Description
T_S        The original computation time of a workload
p_L        The percentage of local checkpoints
p_G        1 − p_L, the percentage of global checkpoints
τ          The local checkpoint interval
δ_L        The local checkpoint overhead (dumping time)
δ_G        The global checkpoint overhead (dumping time)
δ_eq       The equivalent checkpoint overhead in general
R_L        The local checkpoint recovery time
R_G        The global checkpoint recovery time
R_eq       The equivalent recovery time in general
q_L        The percentage of failures covered by local checkpoints
q_G        1 − q_L, the percentage of failures that have to be covered by global checkpoints
MTTF       The system mean time to failure, modeled as 5 years / number of nodes
T_total    The total execution time including all the overhead

The checkpoint dumping time is simply the product of the number of checkpoints, T_S/τ, and the equivalent dumping time per checkpoint, δ_eq; thus

T_{dump} = \frac{T_S}{\tau}\,\delta_{eq} \quad (7.2)

where

\delta_{eq} = \delta_L \cdot p_L + \delta_G \cdot p_G \quad (7.3)

and the parameters δ_L and δ_G are determined by the checkpoint size, the local checkpoint bandwidth, and the global checkpoint bandwidth. When a failure occurs, at least one useful work slot has to be discarded, shown as the wasted computation slots in Figure 7.8(b) and Figure 7.8(c). Together with the recovery time, this part of the overhead can be modeled as follows, with the approximation that the failure occurs, on average, halfway through the compute interval:

T_{rollback,recovery} = \left[\frac{1}{2}(\tau + \delta_{eq}) + R_{eq}\right]\frac{T_{total}}{MTTF} \quad (7.4)

where T_{total}/MTTF is the expected number of failures, and the average recovery time R_eq is expressed as

R_{eq} = R_L \cdot q_L + R_G \cdot q_G \quad (7.5)

and the recovery times R_L and R_G are equal to the checkpoint dumping times δ_L and δ_G (in the reverse direction) plus the system rebooting time. Here, q_L and q_G are the percentages of failures recovered by local and global checkpoints, respectively, and their values are determined in the same way as described in Chapter 7.3.2 at different system scales. Additionally, if a failure has to rely on global checkpoints, more useful computation slots will be discarded, shown as the second dark slot in Figure 7.8(c). In this case, as the average number of local checkpoints between two global checkpoints is p_L/p_G, the number of wasted computation slots is, on average, approximately p_L/2p_G. For example, if p_L = 80% and p_G = 20%, there are 80%/20% = 4 local checkpoints between two global checkpoints, and the expected number of wasted computation slots is p_L/p_G/2 = 2. Hence, this extra rollback cost can be modeled as follows:

T_{extra-rollback} = \frac{p_L q_G}{2 p_G}(\tau + \delta_L)\frac{T_{total}}{MTTF} \quad (7.6)

Eventually, after including all the overhead mentioned above, the total execution time of a checkpointed workload is

T_{total} = T_S + \frac{T_S}{\tau}\,\delta_{eq} + \left[\frac{1}{2}(\tau + \delta_{eq}) + R_{eq}\right]\frac{T_{total}}{MTTF} + \frac{p_L q_G}{2 p_G}(\tau + \delta_L)\frac{T_{total}}{MTTF} \quad (7.7)

It can be observed from this equation that a trade-off exists between the checkpoint frequency and the rollback time. Since many variables in the equation have strict lower bounds and can take only discrete values, we use MATLAB to optimize the two critical parameters, the checkpoint interval τ and the local checkpoint ratio p_L, with a numerical method. It is also feasible to derive closed-form expressions for τ and p_L to enable run-time adjustment for changes in workload size and failure distribution, but they are out of the scope of this dissertation.
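As an illustration of how Equation 7.7 can be optimized numerically, consider the sketch below (a sketch only: the parameter values are hypothetical placeholders, and a simple grid search stands in for the MATLAB-based optimization used in the dissertation).

```python
def total_time(Ts, tau, pL, dL, dG, RL, RG, qL, mttf):
    """Solve Equation 7.7 for T_total given a checkpoint interval and local ratio."""
    pG, qG = 1.0 - pL, 1.0 - qL
    d_eq = dL * pL + dG * pG                        # Eq. 7.3
    R_eq = RL * qL + RG * qG                        # Eq. 7.5
    # Terms that multiply T_total/MTTF (Eqs. 7.4 and 7.6); Eq. 7.7 then becomes
    # a linear equation in T_total, which is solved directly.
    k = 0.5 * (tau + d_eq) + R_eq + (pL * qG / (2 * pG)) * (tau + dL)
    denom = 1.0 - k / mttf
    if denom <= 0:                                  # checkpoint/recovery work exceeds MTTF
        return float("inf")
    return (Ts + (Ts / tau) * d_eq) / denom

# Hypothetical inputs: a 100-hour job, 10s local / 300s global checkpoint dumps,
# matching recovery costs, 83.9% of failures locally recoverable, 1-hour MTTF.
Ts, dL, dG, RL, RG, qL, mttf = 360000.0, 10.0, 300.0, 10.0, 300.0, 0.839, 3600.0
best = min((total_time(Ts, tau, pL, dL, dG, RL, RG, qL, mttf), tau, pL)
           for tau in range(60, 3601, 60)
           for pL in (0.5, 0.6, 0.7, 0.8, 0.9, 0.95))
print("T_total = %.0f s at tau = %d s, pL = %.2f" % best)
```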

7.4 Experimental Results

The primary goal of this work is to improve the checkpoint efficiency and prevent checkpointing from becoming the bottleneck to MPP scalability. In this section, the analytical equations derived in Chapter 7.3.3 are mainly used to estimate the checkpoint overhead. In addition, simulations are also conducted to obtain quantitative parameters such as the checkpoint size.

7.4.1 Checkpointing Scenarios

In order to show how the proposed local/global hybrid checkpoint using PCRAM can reduce the performance and power consumption overhead caused by the checkpointing operations, we study the following 4 scenarios:

1. Pure-HDD: The conventional checkpoint approach that only stores checkpoints in HDD globally.

2. DIMM+HDD: Store checkpoints in PCRAM DIMM locally and in HDD globally. In each node, the PCRAM DIMM capacity is equal to the DRAM DIMM capacity.

3. DIMM+DIMM: Store both local and global checkpoints in the in-node PCRAM DIMM. In each node, the PCRAM DIMM capacity is three times the DRAM DIMM capacity: one copy for the latest local checkpoint and two copies for the global checkpoints (one for the node itself and one for the neighboring node).

4. 3D+3D: Same as DIMM+DIMM, but deploy the PCRAM resource using 3D-PCRAM rather than PCRAM-DIMM.

The bottleneck of each scenario is listed in Table 7.4.

Table 7.4: Bottleneck Factor of Different Checkpoint Schemes

Scheme       Local medium            Local bottleneck
Pure-HDD     -                       -
DIMM+HDD     Self's PCRAM DIMM       Memory bandwidth
DIMM+DIMM    Self's PCRAM DIMM       Memory bandwidth
3D+3D        Self's 3D DIMM          3D bandwidth

Scheme       Global medium           Global bottleneck
Pure-HDD     HDD on I/O nodes        HDD, network bandwidth
DIMM+HDD     HDD on I/O nodes        HDD, network bandwidth
DIMM+DIMM    Neighbor's PCRAM DIMM   Network bandwidth
3D+3D        Neighbor's 3D DIMM      Network bandwidth

Table 7.5: Specifications of the Baseline Petascale System and the Projected Exascale System

                                    1 petaFLOPS   1 exaFLOPS
FLOPS                               10^15         10^18
Year                                2008          2017
Number of sockets                   20,000        100,000
Compute/IO node ratio               15:1          15:1
Memory per socket                   4GB           210GB
Phase-change memory bandwidth       10GB/s        32GB/s
Network bandwidth                   3.5GB/s       400GB/s
Aggregate file system bandwidth     220GB/s       1600GB/s
Normalized soft error rate (SER)    1             32
Transient error percentage          91.5%         99.7%

7.4.2 Scaling Methodology

We use the specification of the IBM Roadrunner Supercomputer [89], which achieves a sustained performance of 1.026 petaFLOPS on LINPACK, to model the 1-petaFLOPS baseline MPP system. Table 7.5 shows the MPP system configurations for the petaFLOPS baseline and a projected exaFLOPS system. For configurations between these two ends, we scale the specification values according to the time frame. For all our evaluations, we assume the timing overhead of initiating a coordinated checkpoint is 1ms, which is reported as the latency of data broadcasting for hardware broadcast trees in BlueGene/L [94].


Figure 7.9: The checkpoint overhead comparison in a 1-petaFLOPS system (normalized to the computation time).


Figure 7.10: The checkpoint overhead comparison in a 1-exaFLOPS system (normalized to the computation time).

7.4.3 Performance Analysis

For all our evaluations, we employ the equations derived in Chapter 7.3.3 to determine the execution time of workloads in various systems and scenarios. For a given system, based on the system scale and the checkpoint size, the optimal checkpoint frequency can be decided. For this checkpoint frequency, an inherent trade-off exists between the proportions of local and global checkpoints. For example, as the fraction of local checkpoints increases, the overall checkpoint overhead drops, but the recovery time from global checkpoints rises; on the other hand, as the fraction of global checkpoints increases, the recovery time decreases, but the total execution time can take a hit because of the higher checkpoint overhead. This trade-off is modeled by Equation 7.7 in Chapter 7.3.3, from which the optimal values of the checkpoint interval (τ) and the percentage of local checkpointing (p_L) can be found.

Figure 7.9 shows the checkpoint overhead in a petascale system using pure-HDD, DIMM+HDD, DIMM+DIMM, and 3D+3D, respectively. DIMM+HDD reduces the checkpoint overhead by 60% on average compared to pure-HDD. Moreover, the ideal "instant checkpoint" is almost achieved by DIMM+DIMM and 3D+3D. The greatly reduced checkpoint overhead directly translates into more effective computation time, or equivalently, higher system availability. The advantages of DIMM+DIMM and 3D+3D become clear as the MPP system is scaled towards the exascale level, where pure-HDD and DIMM+HDD are no longer feasible; Figure 7.10 demonstrates the results. It can be seen that only DIMM+DIMM and 3D+3D are still workable at the exascale level. More importantly, the average overhead of 3D+3D is still less than 5% even in the exascale system. This shows that our intermediate PCRAM-DIMM and ultimate 3D-PCRAM checkpointing solutions can provide the failure resiliency required by future exascale systems with affordable overhead.

7.4.4 Power Analysis

Although the proposed techniques are targeted primarily to reduce the checkpoint overhead, they are useful for power reduction as well:

• Since PCRAM is a non-volatile memory technology, the PCRAM memory cells do not consume any power when the system is not taking checkpoints. Only a small amount of power is consumed by the peripheral circuits in the PCRAM chips, and this power can be further saved by powering the chips off completely in sleep mode. Using 3D+3D PCRAM checkpointing, the PCRAM modules consume no power during more than 95% of the system running time. Other approaches, e.g., battery-backed DRAM checkpointing, will inevitably leak power even when no checkpoints are being taken. Given that the nap power of a 2GB DRAM-DIMM is about 200mW [95], using battery-backed DRAM checkpointing in a 1-petaFLOPS system would inevitably waste about 20kW of power. In contrast, our PCRAM checkpointing modules do not consume any power during the computation time.

• With future supercomputers dissipating many megawatts, it is important to keep system availability high to ensure that the huge power budget is effectively spent on useful computation tasks. DIMM+DIMM can maintain the system availability above 91%, and 3D+3D can achieve nearly 97% system availability even at the exascale level.

7.5 Summary

Checkpoint-restart has been an effective tool for providing reliable and available MPP systems. However, current in-disk checkpointing mechanisms incur high performance penalties and are woefully inadequate for meeting future system demands. To improve the scalability of checkpointing, we introduce eNVM technology into the MPP system as a fast checkpoint device. Combined with the hybrid local/global checkpointing mechanism, PCRAM-DIMM checkpointing enables MPP systems to scale up to 500 petaFLOPS with tolerable checkpoint overhead. To provide reliable systems beyond this scale, we leverage emerging 3D die stacking and propose 3D PCRAM/DRAM memory for checkpointing. After combining all these effects, eNVM-based checkpointing incurs only 3% overhead in an exascale system by making near-instantaneous checkpoints.

Chapter 8

Application-Level: eNVM as On-Chip Cache

Given the attractive properties of high density, fast access, good scalability, and non-volatility, eNVM technologies can challenge the role of SRAM and DRAM in the mainstream memory hierarchy for the first time in more than 30 years. Much innovative research [43,54,96-98] is focusing on designing the next generation of memory systems using these memory technologies. Unlike the previous work, we discuss the feasibility of using ReRAM together with conventional CMOS-compatible SRAM to build a memory hierarchy from the very first level cache to the main memory.

8.1 Overview

Like other eNVM technologies, ReRAM often has high write energy, long write latency, and limited write endurance, which may offset its power benefits, result in performance degradation, or even disallow ReRAM cache deployment altogether. Considering that there is a trade-off among performance, energy, and silicon area (silicon cost) when designing an ReRAM-based memory hierarchy, a thorough architecture-level design space exploration is needed to address three key questions: (1) Will the write endurance limitations of ReRAM hold ReRAM caches back from deployment? (2) How should the proper amount of on-chip ReRAM cache be chosen? (3) How should different types of ReRAM be deployed and partitioned into multiple levels of the memory hierarchy? To answer these questions, in this part of the dissertation we developed a joint circuit-architecture optimization framework to design such a memory hierarchy under different optimization goals with the following four steps:

• We first analyzed the write endurance impact on ReRAM cache deployment and proposed wear-leveling techniques for ReRAM caches;

• We then built a circuit-level model to estimate the performance, energy, and area of ReRAM designs and used it to build a ReRAM circuit library;

• After that, we created a statistical architecture-level model that estimates the system performance and energy consumption under different memory hierarchy designs;

• Finally, we used a simplified simulated annealing algorithm1 to quickly find a near-optimal solution in this design space.

The challenges of this work are threefold. First, considering that ReRAM technology has limited write endurance, we need to identify whether it is feasible to deploy ReRAM in the different levels of the memory hierarchy and to design effective wear-leveling techniques for ReRAM caches. Although inter-set cache write wear-leveling is similar to that for non-volatile main memory [43,97,98], intra-set wear-leveling is a new problem to solve before using ReRAM as caches. Second, since ReRAM technology is still in an early stage, there are only a limited number of ReRAM prototypes available for calibrating the ReRAM module design parameters [6,25-27,99,100]. Moreover, the current ReRAM prototype designs often do not push technology speed or density limits, whereas ultra-fast or ultra-dense ReRAM might play an important role in the memory hierarchy. Therefore, the circuit-level ReRAM model has to be built from theoretical circuit analysis starting from basic device models. Third, we require models that reflect how the architectural performance, e.g., CPI or memory system energy consumption, changes as we tune the underlying memory hierarchy design knobs such as the cache capacity and the cache latency. Conventionally, such a model is built through simulations; however, it is impractical to run time-consuming simulations for each possible design input. To surmount this difficulty, we apply statistical analysis and effectively use limited simulation runs to approximate the entire architectural design space.

We use our framework to explore a broad space of ReRAM designs, from aggressively latency-optimized to highly area-optimized, and fit them into multiple levels of the memory hierarchy. Our work shows that, combined with SRAM L1 and possibly L2 caches, the versatility of ReRAM plus its current write endurance limit allows ReRAM to excel in the remaining memory hierarchy levels, from L3 caches to main memories. If the ReRAM write endurance is further improved by 10X, using an L2 ReRAM cache also becomes feasible and brings extra energy savings. In general, our analysis shows that such an ReRAM-based memory hierarchy has significant benefits in energy reduction with insignificant performance degradation, and achieves overall improvements in EDP and EDAP2.

1Simulated annealing is a generic algorithm for global optimization problems.

8.2 ReRAM-Based Cache Wear-Leveling

Like other non-volatile memory technologies such as NAND flash and PCRAM, ReRAM has limited write endurance (i.e., the number of times that an ReRAM cell can be overwritten). ReRAM researchers can currently achieve write endurance up to 10^10 [6] or 10^11 [35], which we believe can be improved by modifications in device geometry, materials, and processing. A projected plan for future ReRAM highlights endurance on the order of 10^12 or more write cycles [101]. Although this is considerably larger than NAND flash (10^5) or recent PCRAM chips (10^8), the limited write endurance can be an issue for ReRAM caches without wear-leveling.

8.2.1 Inter-Set Cache Line Wear-Leveling

We define the inter-set write variation as the variation of the per-set average write count. Because applications have biased address residency, the cache lines in different sets can experience totally different write access patterns and loads. As one of the extreme examples, the DC.B application repeatedly stresses only a certain portion of the cache sets, resulting in average per-set write counts that range from 1 to 10,000, as shown in Figure 8.1. However, many recent techniques [43,97,98] developed for extending the endurance of PCRAM-based memories can be used to wear-level the inter-set write imbalance of ReRAM caches. In this work, we assume cache set numbers are periodically shifted.

2EDP is energy-delay-product; EDAP is energy-delay-area-product.

Figure 8.1: Inter-set L3 cache line write count variation in a simulated 8-core system with 32KB I-L1, 32KB D-L1, 1MB L2, and 8MB L3 caches.

Figure 8.2: Intra-set L3 cache line write count variation in a simulated 8-core system with 32KB I-L1, 32KB D-L1, 1MB L2, and 8MB L3 caches.
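A minimal sketch of such periodic set-index shifting is given below (illustrative only: the shift interval and the simple additive remapping are assumptions, not the specific scheme of [97]).

```python
class SetIndexShifter:
    """Periodically rotate the physical cache set that each logical set maps to,
    so that write-heavy sets are spread over all physical sets over time."""
    def __init__(self, num_sets, shift_interval):
        self.num_sets = num_sets
        self.shift_interval = shift_interval   # writes between successive shifts
        self.offset = 0
        self.writes = 0

    def physical_set(self, logical_set):
        return (logical_set + self.offset) % self.num_sets

    def on_write(self):
        self.writes += 1
        if self.writes % self.shift_interval == 0:
            self.offset = (self.offset + 1) % self.num_sets
            # NOTE: a real design must also migrate or flush the remapped lines.
```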


Figure 8.3: The comparison of the D-L1 intra-set cache write variations using plain LRU policy and the proposed endurance-aware LRU policy with proactive invalidation: (a) The log-scale write counts of different applications; (b) the linear-scale write counts of selected applications that originally have large intra-set cache write variations.

8.2.2 Intra-Set Cache Line Wear-Leveling

We define the intra-set write variation as the average variation of the write count across a cache set. Ideally, a conventional LRU cache replacement policy should assign the access load evenly to each cache line in a set. However, a balanced access load does not necessarily lead to a balanced write load. For example, if just one cache line in a set is frequently visited by cache write hits, it will absorb a large number of cache writes, and thus the write accesses may be unevenly distributed to the remaining N-1 lines in the set (for an N-way associative cache). Figure 8.2 shows the intra-set write variation. While the intra-set variation is much smaller than the inter-set variation, it still greatly shortens the ReRAM cache lifetime. In order to alleviate the intra-set variation, it is necessary to enhance the LRU replacement policy. In this work, we augment the LRU policy with two modifications:

• We first add a write hit counter (one counter for the entire cache). The counter is 9 bits wide and only incremented at each write hit event.

• If the counter wraps around (from 511 to 0), the cache invalidates the line corresponding to the write hit and reloads the data into another cache line in the set. We call this feature proactive invalidation.

The motivation for this augmentation is twofold: (1) the intra-set write count variation is mainly caused by consecutive write hits; for example, if hot dirty data occupies a cache line, that cache line will be written over and over upon every write hit to this hot data; (2) since the data is hot, it is highly likely that the hot data saturates the write hit counter, so the proactive invalidation feature evicts the hot data from one cache line and reloads it into a relatively fresh cache line in the set. In order to observe the effectiveness of the LRU + proactive invalidation policy, we examine the average write counts of the core-level cache (i.e., the D-L1 cache) and the associated variations. Figure 8.3 is a comparison of the plain LRU policy and our augmented LRU policy with proactive invalidation. The results show that proactive invalidation greatly reduces the intra-set write variation. For some applications such as EP.C, the relative intra-set write variation is reduced from 171% to 7%. Although the average write count increases by 5% on average, the worst-case write count is alleviated, and thus the ReRAM cache lifetime is improved.
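The following sketch shows one way the augmented replacement policy could behave (a simplification under assumptions: a single cache-wide 9-bit counter as described above, per-way recency timestamps standing in for LRU state, and an immediate reload path).

```python
import itertools

class EnduranceAwareLRUSet:
    """One N-way cache set using LRU replacement plus proactive invalidation.

    A cache-wide 9-bit write-hit counter (shared by all sets, passed in as a
    one-element list) is incremented on every write hit; when it wraps from
    511 to 0, the way holding the hot line is invalidated and the data is
    reinstalled in a different, less-recently-used way of the set.
    """
    _clock = itertools.count(1)

    def __init__(self, ways):
        self.tags = [None] * ways
        self.last_used = [0] * ways

    def _lru_way(self, exclude=None):
        candidates = [w for w in range(len(self.tags)) if w != exclude]
        return min(candidates, key=lambda w: self.last_used[w])

    def write(self, tag, hit_counter):
        if tag in self.tags:                          # write hit
            way = self.tags.index(tag)
            hit_counter[0] = (hit_counter[0] + 1) % 512
            if hit_counter[0] == 0:                   # counter wrapped: proactive invalidation
                self.tags[way] = None                 # invalidate the hot way ...
                way = self._lru_way(exclude=way)      # ... and reload into a different way
                self.tags[way] = tag
        else:                                         # write miss: normal LRU replacement
            way = self._lru_way()
            self.tags[way] = tag
        self.last_used[way] = next(EnduranceAwareLRUSet._clock)
        return way

hit_counter = [0]                     # one 9-bit counter shared by the whole cache
cache_set = EnduranceAwareLRUSet(ways=8)
```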

8.2.3 Endurance Requirements for ReRAM Caches

We estimate the endurance requirement for ReRAM caches at different levels by simulating and collecting the cache write access traces over 10 billion clock cycles (i.e., 3.125s of wall clock time on a 3.2GHz CPU). After applying state-of-the-art inter-set wear-leveling [97] and the proactive invalidation-based intra-set wear-leveling, our estimation is that, in order to have a 5-year lifetime guarantee of using ReRAM in L3, L2, and L1 caches, the required ReRAM cell write endurance should be 10^10, 10^11, and 10^13 writes, respectively.

8.3 Circuit-Level ReRAM Model

In order to establish an ReRAM component library spanning from ultra-fast to ultra-dense ReRAM designs, a circuit-level ReRAM model is necessary.

8.3.1 ReRAM Modeling

We use NVSim as the circuit-level model. To properly model the new features of ReRAM, NVSim incorporates ReRAM features such as cross-point access, non-H-tree organization, external sensing, and minimum-sized row decoders. In our circuit-level ReRAM model, both MOS-accessed and cross-point ReRAM structures have been explored. 85

Figure 8.4: The design spectrum of 32nm ReRAM: (upper) read latency vs. density; (bottom) write latency vs. density. Both plots compare MOS-accessed ReRAM, cross-point ReRAM, SRAM, off-chip DRAM, and on-chip eDRAM (density in KB/mm2, latency in ns).

The new cross-point cell array structure that exploits the non-linearity of ReRAM cells makes it possible to build ultra-high-density ReRAM modules and enlarges the scope of possible ReRAM module configurations. However, the area-efficiency benefit of the cross-point structure comes with design overhead. Several design issues, such as half-select write, two-step sequential write, and external sensing [57], are included in our circuit-level ReRAM model.

8.3.2 ReRAM Array Design Spectrum

In general, MOS-accessed ReRAM is faster and cross-point ReRAM is denser. Therefore, ReRAM array designs can fit into a wide spectrum that ranges from fast-access designs to high-area-efficiency designs. This design flexibility makes ReRAM a promising universal memory technology that can be used for the entire memory hierarchy, from the first level cache to the main memory subsystem or even a secondary storage device. Figure 8.4 demonstrates the design spectrum that ReRAM covers and the corresponding regions of SRAM and DRAM. MOS-accessed ReRAM and cross-point ReRAM are more than 10X denser than SRAM, and cross-point ReRAM can be as dense as DRAM. In terms of speed, ReRAM has read speed comparable to that of SRAM, but significantly slower write speed. The write latency of MOS-accessed ReRAM is dominated by the switching pulse duration, which is configured to be 50ns in our experiments, and the latency of cross-point ReRAM is twice this due to two-step writes.

8.4 Architecture-Level Model of Memory Hierarchy Design

At the architectural level, we need performance models that predict the architectural performance of the overall system, such as CPI and access counts at all cache levels, as we change the underlying memory hierarchy. As we focus on designing a universal memory hierarchy in this work, the input parameters at this level are the tuning knobs of the memory subsystem, such as cache capacity, cache associativity, cache read latency, and cache write latency, which define a huge multi-dimensional design space. In a simulation-based approach, long run times are necessary to simulate each possible input setting, making it intractable to explore a large design space. However, simulation accuracy is not the first priority in such a large-scale design space exploration. Instead, a speedy but less accurate architecture-level model is the preferred choice. Previous research addresses this challenge by first randomly sampling a small portion of the entire design space and then building a statistical model from the simulation results to infer the impact of other input configurations on the overall performance metrics [102-107]. Although it is time-consuming to collect sufficient sample data from conventional simulation, this is a one-time effort, and all later outputs can be generated with the statistical model. Different fitting models have been used in the inference process that fits a predictive model through regression: Joseph et al. [102] used linear regression, Lee and Brooks [103] used cubic splines, whereas Azizi et al. [104] applied posynomial functions to create architecture-level models.

Figure 8.5: The basic organization of a two-layer feed-forward artificial neural network (inputs: memory hierarchy design parameters; hidden layer: sigmoid; output layer: linear; outputs: architecture-level predictions such as per-level cache access/miss counts and IPC).

8.4.1 Feed-Forward Network

In this work, we use an artificial neural network (ANN) [105-107] to fit the sampled simulation results into a predictive performance model. Figure 8.5 shows a simplified diagram of a two-layer feed-forward network with one sigmoid hidden layer (which uses sigmoid functions as the calculation kernel) and one linear output layer (which uses linear functions as the calculation kernel). The input and output design parameters are also shown in Figure 8.5. The essential architectural outputs for energy-performance-area evaluation are the read/write access counts and the read/write miss counts of every cache level, the read/write access counts of the main memory, and the number of instructions that each microprocessor core has processed. To feed the architectural model, the inputs of the architectural design space are the capacity, associativity, and read/write latency of all the cache modules and the main memory. The statistical architectural model makes an output estimate from a given input set, and it can be treated as a black box that generates predicted outputs as a function of the inputs:

L1_{readCount} = f_1(L1_{capacity}, L1_{assoc}, L1_{readLatency}, ..., L3_{capacity}, ..., Memory_{writeLatency})
...
L3_{writeMiss} = f_{n-1}(L1_{capacity}, L1_{assoc}, L1_{readLatency}, ..., L3_{capacity}, ..., Memory_{writeLatency})
IPC = f_n(L1_{capacity}, L1_{assoc}, L1_{readLatency}, ..., L3_{capacity}, ..., Memory_{writeLatency}) \quad (8.1)

In our model, the input dimension is 14 (vector I_{14}), and the output dimension is 13 (vector O_{13}). The number of neurons in the hidden layer (X) is S, which ranges from 30 to 60 depending on different fitting targets. In Figure 8.5, W and b are the weight matrix and bias vector of the hidden layer; W' and b' are those of the output layer. The feed-forward ANN is calculated as follows,

X_S = \sigma(W_{S \times 14}\, I_{14} + b_S) \quad (8.2)
O_{13} = \psi(W'_{13 \times S}\, X_S + b'_{13}) \quad (8.3)

where σ(·) is a sigmoid function and ψ(·) is a linear function.
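A minimal NumPy sketch of this two-layer feed-forward predictor is shown below (dimensions follow Equations 8.2 and 8.3; the weights here are random placeholders rather than trained values, and a logistic sigmoid is assumed).

```python
import numpy as np

S = 40                                   # hidden neurons (30-60 in the text)
rng = np.random.default_rng(0)
W,  b  = rng.standard_normal((S, 14)), rng.standard_normal(S)    # hidden layer
Wp, bp = rng.standard_normal((13, S)), rng.standard_normal(13)   # output layer

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def predict(design_inputs):
    """Map a 14-dimensional memory hierarchy design point to the 13 outputs
    (per-level access/miss counts and IPC), per Eqs. 8.2 and 8.3."""
    x = sigmoid(W @ design_inputs + b)   # Eq. 8.2: sigmoid hidden layer
    return Wp @ x + bp                   # Eq. 8.3: linear output layer

print(predict(rng.standard_normal(14)).shape)   # (13,)
```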

8.4.2 Training and Validation

This feed-forward network is able to fit multi-dimensional mapping problems given consistent data and enough neurons in the hidden layer. The accuracy of the statistical architectural model depends on the number of training samples provided from the actual full-system simulations. In this work, 1,000 cycle-accurate full-system simulation results are collected for each workload. Among each 1,000 samples, 800 data samples are used for training, 100 are used for testing, and the other 100 are used for validation during the training procedure to prevent over-training [108]. To reduce variability, multiple rounds of cross-validation, during which data are rotated among the training, testing, and validation sets, are performed using different partitions, and the validation results are averaged over the rounds. Every ANN is configured to have 30-60 hidden neurons and trained using the Levenberg-Marquardt algorithm [109]. The Levenberg-Marquardt algorithm trains the ANN by iteratively adjusting the weight matrices and bias vectors based on the data until the ANN accurately predicts the outputs from the input parameters.


Figure 8.6: An accurate ANN fitting example: MG from NPB.


Figure 8.7: A typical ANN fitting example: dedup from PARSEC.

To measure the model accuracy, we use the metric error = |predicted − actual|/actual. The average prediction error is 4.29%. Figure 8.6 to Figure 8.8 show three examples of the ANN fitting results: a very accurate fit (0.15% error), a typical fit (3.06% error), and the worst fit in this work. Even in the worst case, the prediction error is under 18.71%. More details of the experiment setups, the studied benchmarks, the fitting results, and techniques for reducing fitting errors are discussed in Chapter 8.5.

8.5 Experimental Methodology

In this section, we first present the design space exploration framework, and then describe the simulation environment and the experimental methodology.


Figure 8.8: The worst ANN fitting example: x264 from PARSEC.

8.5.1 Circuit-Architecture Joint Exploration Framework

After creating the circuit-level and the architecture-level models, we build a joint circuit-architecture exploration framework to explore ReRAM designs in different memory hierarchies and to evaluate the trade-off among energy, performance, and silicon area in the microprocessor memory system design space. Figure 8.9 shows an overview of this joint circuit-architecture exploration framework. In this framework, 1,000 randomly generated architecture-level inputs are used to produce 1,000 corresponding samples in the architectural design space. The samples are then fed into the ANN trainer to establish the architecture-level performance model for each benchmark workload, and the trained ANN is used as the architecture-level performance model. The circuit-level inputs are first passed through the ReRAM performance, energy, and area model, and then fed into the ANN-based architecture-level performance model to generate the predicted architecture-level results, such as IPC and power consumption, together with the silicon area estimates. When the predicted result does not meet the design requirement, feedback information containing the distance between the design optimization target and the currently achieved result is sent to a simulated annealing [110] optimization engine, and a new design trial is generated for the optimization loop. This optimization procedure steps forward iteratively until the design requirement (e.g., best EDP or best EDAP) is achieved or a near-optimal solution is reached. If a full design space exploration is required, instead of finding optimal solutions, all the knobs in the joint circuit-architecture design space can be exhaustively searched. The final goal of this framework is to find the best way of partitioning on-chip ReRAM resources to form a proper multi-level memory hierarchy that achieves the optimal energy-performance-area trade-offs.

Figure 8.9: Overview of the optimization framework (circuit-level inputs feed the ReRAM performance/energy/area model; its outputs, together with the memory hierarchy design knobs, feed the ANN-based architecture-level model; a simulated annealing engine closes the optimization loop).
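The optimization loop can be sketched as below (a generic simulated annealing skeleton under assumptions: random_design, neighbor, and evaluate stand in for the framework's design-point generator, perturbation step, and the chained circuit-level/ANN models, and the cooling schedule is arbitrary).

```python
import math, random

def simulated_annealing(random_design, neighbor, evaluate,
                        t_start=1.0, t_end=1e-3, cooling=0.95, steps_per_t=50):
    """Minimize a design cost (e.g., EDP or EDAP predicted by the models)."""
    current = random_design()
    current_cost = evaluate(current)
    best, best_cost = current, current_cost
    t = t_start
    while t > t_end:
        for _ in range(steps_per_t):
            trial = neighbor(current)                # perturb one design knob
            trial_cost = evaluate(trial)
            # Accept improvements always; accept worse designs with a
            # temperature-dependent probability to escape local minima.
            if trial_cost < current_cost or \
               random.random() < math.exp((current_cost - trial_cost) / t):
                current, current_cost = trial, trial_cost
                if current_cost < best_cost:
                    best, best_cost = current, current_cost
        t *= cooling                                 # cool down
    return best, best_cost
```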

8.5.2 Simulation Environment

We use our joint design space optimization framework to evaluate a compute node for a four-core chip multiprocessor (CMP). Each core is configured to be a scaled 32nm in-order superscalar SPARC-V9-like processor core. In this setting, the microprocessor die has four 3.2GHz physical cores, and each core has its private instruction L1 cache (I-L1), data L1 cache (D-L1), and unified L2 cache (L2). The four cores together share an on-die L3 cache. In the design space, both MOS-accessed and cross-point ReRAM have been evaluated for the L1, L2, and L3 caches. In addition, SRAM caches have also been explored, since write latency is critical to performance in certain levels of the cache hierarchy. The detailed input design space is listed in Table 8.1. In this work, I-L1 and D-L1 are assumed to have the same design specification. We use NPB [111] and PARSEC [84] as the experimental workloads. The workload size of the NPB benchmark is CLASS-C (DC has no CLASS-C setting, so CLASS-B is used instead), and the native inputs are used for the PARSEC benchmark to generate realistic program behavior. In total, 23 benchmark applications are evaluated in the experiment. We simulate 1,000 randomly generated design configurations per benchmark using the Simics full-system simulator [82], from which we then generate our statistical architectural model. Each Simics simulation run is fast-forwarded to the pre-defined breakpoint at the code region of interest, warmed up by 1 billion instructions, and then simulated in the detailed timing mode for 10 billion cycles.

Table 8.1: Input design space parameters

Parameter                   Range
Fabrication process         32nm
Processor core              4-core, 3.2GHz, SPARC-V9-like
Transistor model            ITRS high-performance CMOS model
I-L1 (D-L1) capacity        8KB to 64KB
I-L1 (D-L1) associativity   4-way to 8-way
I-L1 (D-L1) memory type     SRAM, MOS-accessed ReRAM, cross-point ReRAM
I-L1 (D-L1) design style    24 in total3
L2 capacity                 None, or 128KB to 512KB
L2 associativity            8-way or 16-way
L2 memory type              SRAM, MOS-accessed ReRAM, cross-point ReRAM
L2 design style             24 in total
L3 capacity                 None, or 512KB to 16MB
L3 associativity            8-way to 32-way
L3 memory type              SRAM, MOS-accessed ReRAM, cross-point ReRAM
L3 design style             24 in total
Main memory type            MOS-accessed ReRAM, cross-point ReRAM
Main memory design style    External current-sensing, area-optimized

Figure 8.10 illustrates the IPC prediction errors of the architecture-level performance model after training. The x-axis shows the relative error between the predicted and the actual values, and the y-axis presents the cumulative distribution function. The prediction results of other output parameters (e.g., L1 read count, L2 write miss, etc.) are similar to the IPC prediction result. After obtaining the access activities of each cache level, the memory subsystem power consumption can be calculated. Because the dynamic energy consumption of main memory is proportional to the last-level cache miss rate, we include it as a part of the memory subsystem power consumption for a fair comparison. In addition, we use McPAT [112] to estimate the power consumption of the logic components, including the processor cores, the on-chip memory controller, and the inter-core crossbar.

3We provide 8 types of optimizations, which are for read latency, write latency, read energy, write energy, read EDP, write EDP, silicon area, and leakage power. We also provide 3 types of sensing schemes, which are external current-sensing, external voltage-sensing, and internal voltage-sensing. Internal current-sensing is not considered as it causes low area efficiency. Therefore, there are 24 memory design types in total.

100% 100% 90% 90% blackscholes 80% 80% 70% 70% canneal 60% 60% facesim 50% BT.C IS.C 50% freqmine CDF 40% CG.C LU.C CDF 40% raytrace 30% DC.B MG.C 30% streamcluster 20% EP.C SP.C 20% swaptions 10% FT.C UA.C 10% vips 0% 0% 0% 5% 10% 15% 20% 25% 30% 0% 0.05% 0.1% 0.15% 0.2% Prediction error Prediction error 100% 90% 80% 70% 60% 50% bodytrack CDF 40% dedup 30% ferret 20% fluidanimate 10% x264 0% 0% 5% 10% 15% 20% 25% 30% Prediction error

Figure 8.10: CDF plots of error on IPC prediction of NPB and PARSEC benchmark applications.

The run-time dynamic power consumption (Power_{logic,dynamic}) is scaled down from the peak dynamic power according to the actual IPC value. The total power consumption of the processor chip is calculated as follows,

Energy_{memory,dynamic} = \sum_{i=1}^{3} \left[ N_{readHit_i} E_{hit_i} + N_{readMiss_i} E_{miss_i} + (N_{writeHit_i} + N_{writeMiss_i}) E_{write_i} \right]
                          + N_{readMiss_3} E_{read_4} + N_{writeMiss_3} E_{write_4}                                (8.4)

Power_{memory,leakage} = 2 N_{core} P_1 + N_{core} P_2 + P_3                                                       (8.5)

Power_{processor,total} = Energy_{memory,dynamic} / T + Power_{logic,dynamic} + Power_{memory,leakage} + Power_{logic,leakage}      (8.6)

In Equation 8.4, N_{readHit_i}, N_{readMiss_i}, N_{writeHit_i}, and N_{writeMiss_i} are the read hit count, read miss count, write hit count, and write miss count of the Level-i cache, which are generated from the ANN prediction. E_{hit_i}, E_{miss_i}, and E_{write_i} are the dynamic energy consumption of a hit, a miss, and a write operation in the Level-i cache, and they are obtained from the circuit-level energy model. E_{read_4} and E_{write_4} are the dynamic energy consumption of main memory read and write operations, since we label the main memory as the fourth level of the memory hierarchy. In Equation 8.5, N_{core} is the number of cores, and P_i represents the leakage power consumption of each cache level. The coefficient of 2 accounts for the identical data and instruction L1 caches (D-L1 and I-L1) in this work. Equation 8.6 gives the total power consumption, where T is the simulation time (T = 10 billion cycles / 3.2GHz = 3.125s according to our experimental setup).

Table 8.2: MOS-accessed and cross-point ReRAM main memory parameters (1Gb, 8-bit, 16-bank)

                           MOS-accessed ReRAM    Cross-point ReRAM
    Die area               129mm2                48mm2
    Read latency           6.2ns                 10.0ns
    Write latency          54.9ns                107.1ns
    Burst read latency     4.3ns                 4.3ns
    Burst write latency    4.3ns                 4.3ns
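As a concrete illustration, the sketch below evaluates Equations 8.4 through 8.6 for a single design point. The function and argument names are ours; the per-access energies, leakage values, and access counts are placeholders to be filled in from the circuit-level model and the ANN prediction.

```python
# Straightforward evaluation of Equations 8.4-8.6 (names and structure are ours).

def memory_dynamic_energy(read_hits, read_misses, write_hits, write_misses,
                          e_hit, e_miss, e_write, e_read_mem, e_write_mem):
    """Equation 8.4: per-level lists are indexed 0..2 for L1..L3; main memory is 'level 4'."""
    energy = 0.0
    for i in range(3):
        energy += (read_hits[i] * e_hit[i] + read_misses[i] * e_miss[i]
                   + (write_hits[i] + write_misses[i]) * e_write[i])
    # L3 misses are served by the ReRAM main memory.
    energy += read_misses[2] * e_read_mem + write_misses[2] * e_write_mem
    return energy

def memory_leakage_power(n_core, p1, p2, p3):
    """Equation 8.5: 2*Ncore L1 instances (I-L1 and D-L1), Ncore private L2s, one shared L3."""
    return 2 * n_core * p1 + n_core * p2 + p3

def processor_total_power(e_mem_dynamic, t, p_logic_dynamic, p_mem_leakage, p_logic_leakage):
    """Equation 8.6: average memory dynamic power plus logic dynamic and all leakage power."""
    return e_mem_dynamic / t + p_logic_dynamic + p_mem_leakage + p_logic_leakage

# Example wiring (placeholder inputs, except T = 3.125s and the 7.41W logic leakage
# quoted in the text above):
# total = processor_total_power(e_mem, t=3.125, p_logic_dynamic=5.0,
#                               p_mem_leakage=1.0, p_logic_leakage=7.41)
```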

8.6 Design Exploration and Optimization

In this section, we demonstrate how to use the optimization framework to perform design space exploration for memory hierarchy design.

8.6.1 Cache Hierarchy Design Exploration

We first exhaustively explore the cache hierarchy design space via the ANN model to show the Pareto-optimal curves of the trade-off range. In this step, we separate the cache design space (L1, L2, and L3) from the memory design space. We suppose the main memory is built with either cross-point ReRAM optimized for density or MOS-accessed ReRAM optimized for latency. We model the ReRAM main memory after the DDR3 protocol, with 8-burst-deep prefetching and a 993MHz memory clock. Table 8.2 lists the timing and area parameters of the MOS-accessed ReRAM and the cross-point ReRAM main memory solutions. We assume both MOS-accessed and cross-point ReRAM main memory modules would be available in the future in a DIMM form factor, with MOS-accessed ReRAM aiming at the high-performance market thanks to its faster write speed, and cross-point ReRAM aiming at the low-cost market as it is more than 2X denser.
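Given the predicted IPC and total power for every explored configuration, the Pareto-optimal points plotted in Figure 8.11 can be extracted with a simple dominance filter such as the generic sketch below; this is an illustrative helper, not code from the framework itself.

```python
# Keep the configurations that are Pareto-optimal when maximizing IPC and
# minimizing power. 'points' holds (power_watts, ipc, label) tuples produced
# by the ANN-based model.
def pareto_front(points):
    front = []
    for power, ipc, label in points:
        dominated = any(q_power <= power and q_ipc >= ipc and
                        (q_power, q_ipc) != (power, ipc)
                        for q_power, q_ipc, _ in points)
        if not dominated:
            front.append((power, ipc, label))
    return sorted(front)  # sorted by power for plotting

# Toy example with points resembling D1, D4, and D7 from Figure 8.11:
demo = [(10.19, 0.123, "D1"), (12.58, 0.600, "D4"),
        (23.90, 0.661, "D7"), (15.00, 0.300, "dominated")]
print(pareto_front(demo))   # the dominated point is filtered out
```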

[Figure 8.11 plot: IPC (y-axis) versus total processor power and main memory dynamic power in watts (x-axis), with design points D1 through D7 and separate curves for MOS-accessed ReRAM main memory, cross-point ReRAM main memory, and PCM main memory.]

Figure 8.11: Pareto-optimal curves: energy and performance trade-off of the memory hierarchy. Main memory dynamic power is included for a fair comparison.

Focusing on the design space exploration of ReRAM-based memory hierarchies, Figure 8.11 shows the Pareto-optimal curves of the power-performance trade-off. The x-axis is the total power consumption of the processor chip, and the y-axis is the IPC performance. Figure 8.11 shows that a large amount of power can be saved at the cost of only a small performance degradation. For instance, design option D4 (using SRAM L1 and L2 caches but a ReRAM L3 cache) reaches 0.600 IPC while consuming 12.58W of total power. Compared to design option D7 (using SRAM for all of the L1, L2, and L3 caches), which reaches 0.661 IPC but consumes 23.90W of power, the power reduction is 47% while the performance degradation is only 9%. This design option also meets the constraint set by the 10^10 write endurance requirement discussed in Chapter 8.2. If ReRAM write endurance is further improved, more aggressive options (e.g., deploying L2 ReRAM caches) can further reduce the power consumption.
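The 47% and 9% figures quoted above follow directly from the numbers in Tables 8.3 and 8.4, as the short check below shows.

```python
# D4 versus D7 from Tables 8.3 and 8.4 (power in watts, performance in IPC).
power_d4, ipc_d4 = 12.58, 0.600
power_d7, ipc_d7 = 23.90, 0.661

power_reduction = (power_d7 - power_d4) / power_d7   # about 0.47
perf_degradation = (ipc_d7 - ipc_d4) / ipc_d7        # about 0.09
print(f"{power_reduction:.0%} power reduction, {perf_degradation:.0%} IPC degradation")
```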

[Figure 8.12 plot: IPC (y-axis) versus total processor power and main memory dynamic power in watts (x-axis), with curves for SRAM used in L1, L2, and L3 caches; SRAM only used in L1 and L2 caches; SRAM only used in the L1 cache; and no SRAM (ReRAM only).]

Figure 8.12: Pareto-optimal curves (cross-point ReRAM as main memory): energy and performance trade-off under different constraints on SRAM deployment.

For example, design option D3 (using ReRAM L2 and L3 caches) reaches 0.503 IPC while consuming only 11.44W of total power. To demonstrate how different cache hierarchy designs have been explored, we list the design parameters of seven design options (D1 to D7) in Table 8.3 and Table 8.4. We find that the Pareto-optimal curves are composed of several segments, such as D1-to-D2 and D4-to-D6. The switch from one segment to another comes from SRAM deployment at certain cache levels. In general, significant IPC improvements are achieved by adding more SRAM resources, and greater reductions in power consumption come from replacing SRAM with ReRAM. To show the effect of SRAM deployment, we plot three other Pareto-optimal curves in Figure 8.12. This figure shows that a pure-ReRAM cache hierarchy is on the global Pareto front but achieves less than 0.45 IPC, and that segment has a large slope. Thus, it suggests that SRAM should still be deployed at least in the L1 caches. This conclusion is consistent with our earlier estimation that the current ReRAM technology only allows the deployment of L3 ReRAM caches in terms of write endurance.

Table 8.3: On-die cache hierarchy design parameters of 7 design options

                              D1        D2        D3        D4
    L1 capacity               32KB      8KB       8KB       8KB
    L1 associativity          8         8         4         4
    L1 memory type^4          X-ReRAM   SRAM      SRAM      SRAM
    L1 optimized for          L         RP        RL        RL
    L1 sensing scheme         EX        IN        IN        IN
    L2 capacity               N/A       128KB     64KB      64KB
    L2 associativity          N/A       16        8         8
    L2 memory type            N/A       M-ReRAM   M-ReRAM   SRAM
    L2 optimized for          N/A       A         RL        RP
    L2 sensing scheme         N/A       EX        IN        IN
    L3 capacity               N/A       N/A       4MB       4MB
    L3 associativity          N/A       N/A       16        16
    L3 memory type            N/A       N/A       M-ReRAM   M-ReRAM
    L3 optimized for          N/A       N/A       L         WP
    L3 sensing scheme         N/A       N/A       EX        IN
    IPC                       0.123     0.449     0.503     0.600
    Power consumption (W)     10.19     10.90     11.44     12.58
    Silicon area (mm2)        22.83     23.06     24.00     24.90
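For later reference (e.g., in the optimization sketches of Chapter 8.6.2), a design option such as D4 can be written out as plain data. The field names below are our own; the values follow Table 8.3 with the abbreviations expanded per footnote 4.

```python
# Design option D4 from Table 8.3 as a plain dictionary
# (field names are illustrative; values follow the table).
D4 = {
    "L1": {"capacity": "8KB",  "assoc": 4,  "type": "SRAM",
           "optimized_for": "read latency", "sensing": "internal"},
    "L2": {"capacity": "64KB", "assoc": 8,  "type": "SRAM",
           "optimized_for": "read EDP",     "sensing": "internal"},
    "L3": {"capacity": "4MB",  "assoc": 16, "type": "MOS-accessed ReRAM",
           "optimized_for": "write EDP",    "sensing": "internal"},
}
```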

L2 ReRAM cache deployment can achieve a considerable power reduction by sacrificing only a small amount of performance. However, the feasibility of L2 ReRAM caches also depends on improvements in the write endurance of future ReRAM technologies. Another benefit obtained from using ReRAM caches is silicon area reduction. Figure 8.13 shows the Pareto-optimal curves of the cache area and performance trade-off, which have shapes similar to the ones in the power-performance trade-off. In Figure 8.13, the minimum chip area corresponds to the case in which there is no on-chip cache. The processor core area (including the memory controller and crossbar) is 22.8mm2 as estimated by McPAT [112]. Achieving the highest performance using pure-SRAM caches costs at least another 8mm2 of silicon area, while replacing the SRAM L3 cache with ReRAM can save more than 3mm2 of chip area by degrading performance from an IPC of 0.66 to 0.60. We show the feasible region of designs with less than 3mm2 of total cache area in Figure 8.14.

^4 Memory type abbreviations: M-ReRAM = MOS-accessed ReRAM, X-ReRAM = Cross-point ReRAM. Optimization abbreviations: RL = Read Latency, WL = Write Latency, RE = Read Energy, WE = Write Energy, RP = Read EDP, WP = Write EDP, L = Leakage, A = Area. Sensing scheme abbreviations: IN = Internal, EX = External.

Table 8.4: On-die cache hierarchy design parameters of 7 design options (Continued)

                              D5        D6        D7
    L1 capacity               32KB      8KB       8KB
    L1 associativity          8         8         4
    L1 memory type            SRAM      SRAM      SRAM
    L1 optimized for          RL        RP        RL
    L1 sensing scheme          IN        IN        IN
    L2 capacity               64KB      64KB      64KB
    L2 associativity          8         8         16
    L2 memory type            SRAM      SRAM      SRAM
    L2 optimized for          RP        RP        WP
    L2 sensing scheme         IN        IN        IN
    L3 capacity               32MB      4MB       4MB
    L3 associativity          16        16        16
    L3 memory type            M-ReRAM   SRAM      SRAM
    L3 optimized for          RP        RE        RL
    L3 sensing scheme         IN        IN        IN
    IPC                       0.607     0.609     0.661
    Power consumption (W)     16.91     20.93     23.90
    Silicon area (mm2)        31.51     30.84     32.55

This result is extremely useful in the low-cost computing segment, where the performance only needs to be good enough and the chip cost is the first priority. Figure 8.14 also indicates that using ReRAM caches can reduce power consumption and silicon area at the same time, which further improves the EDAP metric.

8.6.2 Design Optimization

Running a full design space exploration using an exhaustive search is time-consuming and may not be necessary in most cases. Therefore, to use this joint circuit-architecture model as a practical memory hierarchy design assistant, an efficient optimization method is required. In this work, we use a simplified simulated annealing [110] algorithm to find a near globally optimal solution. The simulated annealing heuristic is described in Algorithm 2. In this optimization methodology, we first randomly choose an initial design option, s_0, and calculate its annealing energy function from the joint circuit-architecture model. The annealing energy function can be EDP, EDAP, or any other energy-performance-area combination.

[Figure 8.13 plot: IPC (y-axis) versus total silicon area of the processor chip in mm2 (x-axis), with curves for SRAM used in L1, L2, and L3 caches; SRAM only used in L1 and L2 caches; SRAM only used in the L1 cache; and no SRAM (ReRAM only).]

Figure 8.13: Pareto-optimal curves (cross-point ReRAM as main memory): cache area and performance trade-off under different constraints on SRAM deployment.

Algorithm 2 Design space optimization algorithm
  state = s_0, energy = E(state)
  repeat
    new_state = neighbour(state), new_energy = E(new_state)
    if new_energy < energy then
      state = new_state, energy = new_energy   {Accept unconditionally}
    else if T(energy, new_energy) > random() then
      state = new_state, energy = new_energy   {Accept with probability}
    end if
  until energy stops improving in the last K rounds
  return state

The optimization loop continuously tries neighboring options^5 of the current one. If the new design option is better than the previous one, it is adopted unconditionally; if not, it is adopted with a probability determined by an acceptance function.
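A possible realization of the neighbour() step, following footnote 5 (two parameters from the L1/L2/L3 capacity, associativity, and memory-type set are changed at a time), is sketched below. The candidate value lists follow the input design space of Table 8.1, and the design-option representation matches the dictionary shown after Table 8.3; both are our own illustrative choices.

```python
import random

# Sketch of neighbour() from Algorithm 2: mutate two of the nine tunable
# parameters (L1/L2/L3 capacity, associativity, memory type).
CHOICES = {
    ("L1", "capacity"): ["8KB", "16KB", "32KB", "64KB"],
    ("L1", "assoc"):    [4, 8],
    ("L1", "type"):     ["SRAM", "MOS-accessed ReRAM", "Cross-point ReRAM"],
    ("L2", "capacity"): [None, "128KB", "256KB", "512KB"],
    ("L2", "assoc"):    [8, 16],
    ("L2", "type"):     ["SRAM", "MOS-accessed ReRAM", "Cross-point ReRAM"],
    ("L3", "capacity"): [None, "512KB", "1MB", "2MB", "4MB", "8MB", "16MB"],
    ("L3", "assoc"):    [8, 16, 32],
    ("L3", "type"):     ["SRAM", "MOS-accessed ReRAM", "Cross-point ReRAM"],
}

def neighbour(state):
    """Return a copy of 'state' with two randomly chosen parameters changed."""
    new_state = {level: dict(params) for level, params in state.items()}
    for level, param in random.sample(list(CHOICES), 2):
        new_state[level][param] = random.choice(CHOICES[(level, param)])
    return new_state
```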

^5 A neighboring option is generated by changing two parameters from the parameter set of L1/L2/L3 capacity, associativity, and memory type.

Figure 8.14: The global Pareto-optimal curve (cross-point ReRAM as main memory) and feasible design options with total cache area less than 3mm2.

The acceptance probability, P_{accept}, is defined by Equation 8.7 as follows,

P_{accept}(energy, new\_energy) =
    \begin{cases}
      1                     & \text{if } new\_energy < energy \\
      energy / new\_energy  & \text{if } energy \le new\_energy < 1.3 \, energy \\
      0                     & \text{otherwise}
    \end{cases}                                                                  (8.7)

Figure 8.15 shows how the simulated annealing algorithm evolves the memory hierarchy design from an initial random option to a near globally optimal one in terms of EDP. In addition, Figure 8.16 shows the EDAP optimization path. Compared to an exhaustive search of the same design space, which takes more than 8 hours on an 8-core Xeon X5570 microprocessor, the proposed optimization methodology usually finds near-optimal values in less than 30 seconds. This optimization scheme provides an almost instant design decision given any specified performance, energy, or area requirements. Furthermore, it becomes feasible to integrate this model into higher-level tools that consider not only the memory system design trade-offs but also design trade-offs within microprocessor cores [104].
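Putting Algorithm 2 and Equation 8.7 together, a compact rendering of the annealing loop might look like the sketch below. Here evaluate() stands in for the joint circuit-architecture model (returning EDP, EDAP, or another combined metric), neighbour() is the sketch given after Algorithm 2, and edp_of_design is a hypothetical evaluation function.

```python
import random

def p_accept(energy, new_energy):
    """Acceptance probability of Equation 8.7."""
    if new_energy < energy:
        return 1.0
    if new_energy < 1.3 * energy:
        return energy / new_energy
    return 0.0

def anneal(s0, evaluate, k=50):
    """Algorithm 2: stop once the energy has not improved for k consecutive rounds."""
    state, energy = s0, evaluate(s0)
    stall = 0
    while stall < k:
        candidate = neighbour(state)          # defined in the earlier sketch
        new_energy = evaluate(candidate)
        old_energy = energy
        if random.random() < p_accept(energy, new_energy):
            state, energy = candidate, new_energy
        stall = 0 if energy < old_energy else stall + 1
    return state, energy

# Usage with a hypothetical evaluate() built on the ANN and circuit models:
# best_state, best_energy = anneal(D4, evaluate=edp_of_design)
```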

[Figure 8.15 plot: IPC versus total processor power and main memory dynamic power (W), showing the global Pareto-optimal curve, an iso-EDP curve, and the optimization path from a random initialization point to the optimized solution.]

Figure 8.15: The path of EDP optimization.

8.6.3 Discussion

Power consumption has been a primary design concern for many years. Our exploration and optimization results demonstrate that both the EDP and EDAP optimal points are close to the y-axis on the IPC-versus-power plot. In this design space region, ReRAM resources are heavily adopted in the memory hierarchy design (using L2 and L3 ReRAM caches). Even when performance constraints are applied, using ReRAM starting from the L3 cache always brings energy efficiency. Moreover, locating these energy-optimal points on the IPC-versus-area plot also shows a large silicon area saving achieved from ReRAM without incurring much performance degradation. Compared to the best values of pure-SRAM designs, the introduction of ReRAM in the L2 and L3 caches improves EDP and EDAP by 36% and 51%, respectively, on a scaled 32nm 4-core SPARC-V9-like processor chip. The memory technology shift from SRAM to ReRAM achieves these improvements for the following reasons:

[Figure 8.16 plot: IPC versus the power and area product (W*mm2), showing the global Pareto-optimal curve, an iso-EDAP curve, and the optimization path from a random initialization point to the optimized solution.]

Figure 8.16: The path of EDAP optimization.

• The compact ReRAM module size greatly reduces the silicon area used for on-chip memories (EDAP improvement), or allows more on-chip memory resource deployment to improve the performance (EDP and EDAP improvement);

• The relatively smaller ReRAM size implies shorter wordlines and bitlines in the ReRAM cell array, and thus reduces the dynamic energy consumption per memory access (EDP and EDAP improvement);

• The non-volatility property of ReRAM eliminates the leakage energy consumption of memory cells (EDP and EDAP improvement).

Therefore, we envision a heterogeneous memory hierarchy as shown in Figure 8.17 and summarized in Table 8.5. In such a hierarchy, SRAM is used in the L1 and possibly L2 caches, MOS-accessed ReRAM may be used in the L2 and L3 caches and in main memory, and low-cost cross-point ReRAM may be used in the L3 cache and in main memory. Eventually, we predict that 3D cross-point ReRAM will be well suited for secondary storage devices.

[Figure 8.17 diagram: a chip package containing the microprocessor die (four cores, each with its own I-L1, D-L1, and L2 caches, sharing an on-die L3 cache), ReRAM main memory, and ReRAM secondary storage, labeled as levels 1 through 5 of the memory hierarchy.]

Figure 8.17: A proposed universal memory hierarchy using ReRAM.

Table 8.5: Overview of the proposed universal memory hierarchy

    Level          Memory type                          Write endurance requirement
    L1 cache       SRAM                                 10^13 [Chapter 8.2]
    L2 cache       SRAM or MOS-accessed ReRAM           10^11 [Chapter 8.2]
    L3 cache       MOS-accessed or cross-point ReRAM    10^10 [Chapter 8.2]
    Main memory    MOS-accessed or cross-point ReRAM    10^8 [98]

8.7 Summary

ReRAM is a promising memory technology that fits into many levels of the memory hierarchy. In this work, we first examine the write endurance requirements for deploying ReRAM in caches. Together with state-of-the-art wear leveling that alleviates inter-set write variation, we enhance the LRU cache replacement policy with proactive invalidation to reduce intra-set variation. On average, our analysis shows that the write endurance requirement for ReRAM L3 caches is 10^10, and that write endurance higher than 10^11 or 10^13 is needed for using ReRAM in L2 or L1 caches, respectively. Then, we build a circuit-level timing, power, and area estimation model for ReRAM. We use this model to explore a wide range of ReRAM circuitry implementations and generate architecturally relevant parameters. After that, we integrate this circuit-level model into an ANN-based architecture-level model and create a general energy-performance-area optimization framework for ReRAM-based memory hierarchy design in a joint circuit-architecture design space. Our validation results show that the proposed framework is sufficiently accurate for the purpose of design space exploration, and by using this framework we are able to rapidly explore a very large space of memory hierarchy designs and find good solutions in terms of energy-performance-area trade-offs. Our experimental results reveal the memory design preference for ReRAM in a 4-core CMP setting when the design targets EDP or EDAP goals. Our results show that using ReRAM starting from the L2 caches can achieve a 36% EDP improvement and a 51% EDAP improvement, and this meets current ReRAM write endurance constraints when coupled with proper wear leveling.

In general, this work is an initial effort to study the feasibility of building an energy-efficient ReRAM-based memory hierarchy. The fast read access, high density, good scalability, non-volatility, and the wide range of memory array designs make ReRAM a promising candidate for such an energy-efficient memory hierarchy, and the write endurance limit of ReRAM does not necessarily hold ReRAM back from cache applications. Our joint circuit-architecture analysis verifies the feasibility of such a ReRAM-based hierarchy and also shows its benefits for area and power reduction. We hope this work can be a step towards a new generation of energy-efficient heterogeneous hierarchies.

Chapter 9

Conclusion

Multiple non-volatile memory technologies, including STTRAM, PCRAM, and ReRAM, are emerging. These emerging non-volatile memory technologies have the attractive properties of high density, fast access, good scalability, and non-volatility. Therefore, they have drawn the attention of the computer architecture community and challenged the role of SRAM and DRAM in the mainstream memory hierarchy for the first time in more than 30 years. Given the desirable properties of these emerging memory technologies, much innovative research has been focusing on designing the next generation of high-performance and low-power computing systems using these memory technologies. This dissertation tackles this topic from three different but highly related perspectives.

First, circuit-level performance, energy, and area models for various non-volatile memories are built and described in the first part of this dissertation. The motivation for building these models comes from the current situation in which all of these emerging non-volatile memory technologies are still in their prototyping stage. Although various device-level research efforts have successfully demonstrated working non-volatile memory cells with promising properties optimized for different targets, such a large variation in the properties of memory cells leaves the circuit-level parameters, such as memory access latency, memory access energy, and memory module area, as open questions. This brings uncertainty and difficulty for system-level researchers trying to evaluate the benefit of these new technologies. Therefore, a circuit-level model that connects the device-level and the architecture-level research is required.

Second, since non-volatile memory technologies generally have drawbacks in write speed, write energy consumption, and write endurance, it is necessary to use architecture-level design enhancements to alleviate these drawbacks. Thus, after building a circuit-level model for non-volatile memories, the second part of this dissertation focuses on using architecture-level techniques to mitigate the aforementioned shortcomings of non-volatile memory write operations. In addition, architecture-level evaluation is performed to show that the proposed techniques are all effective in reaching their goals.

Last but not least, application-level case studies of adopting emerging technologies are conducted in this dissertation. These application-level case studies cover secondary storage, the memory hierarchy, and checkpointing, and demonstrate how non-volatile memory technologies can be used in these applications to greatly improve either the performance or the power efficiency. These case studies are also important to the current emerging non-volatile memory revolution because they validate the idea of replacing traditional memory/disk technologies (i.e., SRAM, DRAM, NAND flash, and HDD) with emerging non-volatile memory technologies, and this can be one of the drivers pushing these technologies toward maturity.

We hope the work of this dissertation will be useful and have an impact on eNVM-related research on future high-performance and low-power computing systems.

Bibliography

[1] K.-J. Lee, B.-H. Cho, W.-Y. Cho, S.-B. Kang et al., “A 90nm 1.8V 512Mb diode- switch PRAM with 266MB/s read throughput,” IEEE Journal of Solid-State Circuits, vol. 43, no. 1, pp. 150–162, 2008.

[2] M. H. Kryder and C. S. Kim, “After hard drives - what comes next?” IEEE Trans- actions on Magnetics, vol. 45, no. 10, pp. 3406–3413, 2009.

[3] L. M. Grupp, A. M. Caulfield, J. Coburn, S. Swanson et al., “Characterizing flash memory: Anomalies, observations, and applications,” in Proceedings of the Interna- tional Symposium on Microarchitecture, 2009, pp. 24–33.

[4] K. Tsuchida, T. Inaba, K. Fujita, Y. Ueda et al., “A 64Mb MRAM with clamped- reference and adequate-reference schemes,” in Proceedings of the International Solid- State Circuits Conference, 2010, pp. 268–269.

[5] S. J. Ahn, Y. J. Song, C. W. Jeong, J. M. Shin et al., “Highly manufacturable high density phase change memory of 64Mb and beyond,” in Proceedings of the Interna- tional Electron Devices Meeting, 2004, pp. 907–910.

[6] S.-S. Sheu, M.-F. Chang, K.-F. Lin, C.-W. Wu et al., “A 4Mb embedded SLC resistive- RAM macro with 7.2ns read-write random-access time and 160ns MLC-access capabil- ity,” in Proceedings of the IEEE International Solid-State Circuits Conference, 2011, pp. 200–201.

[7] D. Roberts, T. Kgil, and T. Mudge, “Using non-volatile memory to save energy in servers,” in Proceedings of the Design, Automation & Test in Europe, 2009, pp. 743– 748.

[8] K. Lim, P. Ranganathan, J. Chang, C. Patel et al., “Understanding and designing new server architectures for emerging warehouse-computing environments,” in Proceedings of the International Symposium on Computer Architecture, 2008, pp. 315–326.

[9] K. Lim, J. Chang, T. Mudge, P. Ranganathan et al., “Disaggregated memory for ex- pansion and sharing in blade servers,” in Proceedings of the International Symposium on Computer Architecture, 2009, pp. 267–278.


[10] M. Motoyoshi, T. Yamamura, W. Ohtsuka, M. Shouji et al., “A study for 0.18µm high-density MRAM,” in Proceedings of the Symposium on VLSI Technology, 2004, pp. 22–23.

[11] M. Hosomi, H. Yamagishi, T. Yamamoto, K. Bessho et al., “A novel nonvolatile memory with spin torque transfer magnetization switching: Spin-RAM,” in Proceedings of the International Electron Devices Meeting, 2005, pp. 459–462.

[12] H. Tanizaki, T. Tsuji, J. Otani, Y. Yamaguchi et al., “A high-density and high-speed 1T-4MTJ MRAM with voltage offset self-reference sensing scheme,” in Proceedings of the Asian Solid-State Circuits Conference, 2006, pp. 303–306.

[13] T. Kawahara, R. Takemura, K. Miura, J. Hayakawa et al., “2Mb spin-transfer torque RAM (SPRAM) with bit-by-bit bidirectional current write and parallelizing-direction current read,” in Proceedings of the International Solid-State Circuits Conference, 2007, pp. 480–617.

[14] F. Pellizzer, A. Pirovano, F. Ottogalli, M. Magistretti et al., “Novel µTrench phase-change memory cell for embedded and stand-alone non-volatile memory applications,” in Proceedings of the International Symposium on VLSI Technology, 2004, pp. 18–19.

[15] N. Matsuzaki, K. Kurotsuchi, Y. Matsui, O. Tonomura et al., “Oxygen-doped GeSbTe phase-change memory cells featuring 1.5V/100µA standard 0.13µm CMOS operations,” in Proceedings of the IEEE International Electron Devices Meeting, 2005, pp. 738–741.

[16] J. H. Oh, J. H. Park, Y. S. Lim, H. S. Lim et al., “Full integration of highly manufacturable 512Mb PRAM based on 90nm technology,” in Proceedings of the International Electron Devices Meeting, 2006, pp. 49–52.

[17] H.-R. Oh, B.-H. Cho, W.-Y. Cho, S. Kang et al., “Enhanced write performance of a 64-Mb phase-change random access memory,” IEEE Journal of Solid-State Circuits, vol. 41, no. 1, pp. 122–126, 2006.

[18] T. Nirschl, J. B. Phipp, T. D. Happ, G. W. Burr et al., “Write strategies for 2 and 4-bit multi-level phase-change memory,” in Proceedings of the IEEE International Electron Devices Meeting, 2007, pp. 461–464.

[19] S. Hanzawa, N. Kitai, K. Osada, A. Kotabe et al., “A 512kB embedded phase change memory with 416kB/s write throughput at 100µA cell write current,” in Proceedings of the International Solid-State Circuits Conference, 2007, pp. 474–616.

[20] S. Raoux, G. W. Burr, M. J. Breitwisch, C. T. Rettner et al., “Phase-change random access memory: A scalable technology,” IBM Journal of Research and Development, vol. 52, no. 4/5, 2008.

[21] D.-H. Kang, J.-H. Lee, J. Kong, D. Ha et al., “Two-bit cell operation in diode-switch phase change memory cells with 90nm technology,” in Proceedings of the Symposium on VLSI Technology, 2008, pp. 98–99.

[22] D. Kau, S. Tang, I. V. Karpov, R. Dodge et al., “A stackable cross point phase change memory,” in Proceedings of the IEEE International Electron Devices Meeting, 2009, pp. 27.1.1–27.1.4.

[23] S. Yoshitaka, K. Masaharu, M. Takahiro, K. Kenzo et al., “Cross-point phase change memory with 4F2 cell size driven by low-contact-resistivity poly-Si diode,” in Pro- ceedings of the Symposium on VLSI Technology, 2009, pp. 24–25.

[24] F. Bedeschi, R. Fackenthal, C. Resta, E. M. Donze et al., “A bipolar-selected phase change memory featuring multi-level cell storage,” IEEE Journal of Solid-State Cir- cuits, vol. 44, no. 1, pp. 217–227, 2009.

[25] J. J. Yang, M. D. Pickett, X. Li, D. A. A. Ohlberg et al., “Memristive switching mechanism for metal/oxide/metal nanodevices,” Nature Nanotechnology, vol. 3, no. 7, pp. 429–433, 2008.

[26] Y.-C. Chen, C.-F. Chen, C.-T. Chen, J.-Y. Yu et al., “An access-transistor-free (0T/1R) non-volatile resistance random access memory (RRAM) using a novel thresh- old switching, self-rectifying chalcogenide device,” in Proceedings of the International Electron Devices Meeting, 2003, pp. 750–753.

[27] L. Chen, Y. Xu, Q.-Q. Sun, H. Liu et al., “Highly uniform bipolar resistive switch- ing with Al2O3 buffer layer in robust NbAlO-based RRAM,” IEEE Electron Device Letters, vol. 31, no. 4, pp. 356–358, 2010.

[28] Z. Wei, Y. Kanzawa, K. Arita, Y. Katoh et al., “Highly reliable TaOx ReRAM and direct evidence of redox reaction mechanism,” in Proceedings of the International Electron Devices Meeting, 2008, pp. 293–296.

[29] Y. S. Chen, H. Y. Lee, P. S. Chen, P. Y. Gu et al., “Highly scalable hafnium oxide memory with improvements of resistive distribution and read disturb immunity,” in Proceedings of the International Electron Devices Meeting, 2009, pp. 105–108.

[30] K.-H. Kim, S. H. Jo, S. Gaba, and W. Lu, “Nanoscale resistive memory with intrinsic diode characteristics and long endurance,” Applied Physics Letters, vol. 96, no. 5, pp. 053 106.1–053 106.3, 2010.

[31] W. S. Lin, F. T. Chen, C. H. L. Chen, and M.-J. Tsai, “Evidence and solution of over-RESET problem for HfOx based resistive memory with sub-ns switching speed and high endurance,” in Proceedings of the International Electron Devices Meeting, 2010, pp. 19.7.1–19.7.4.

[32] W. C. Chien, Y. C. Chen, K. P. Chang, E. K. Lai et al., “Multi-level operation of fully CMOS compatible WOx resistive random access memory (RRAM),” in Proceedings of the International Memory Workshop, 2009, pp. 228–229.

[33] S. H. Jo, K.-H. Kim, and W. Lu, “High-density crossbar arrays based on a Si memristive system,” Nano Letters, vol. 9, pp. 870–874, 2009.

[34] M.-J. Lee, Y. Park, B.-S. Kang, S.-E. Ahn et al., “2-stack 1D-1R cross-point structure with oxide diodes as switch elements for high density resistance RAM applications,” in Proceedings of the IEEE International Electron Devices Meeting, 2007, pp. 771–774.

[35] Y.-B. Kim, S. Lee, D. Lee, C. Lee et al., “Bi-layered RRAM with unlimited endurance and extremely uniform switching,” in Proceedings of the Symposium on VLSI Technology, 2011, pp. 52–53.

[36] J. Condit, E. B. Nightingale, C. Frost, E. Ipek et al., “Better I/O through byte- addressable, persistent memory,” in Proceedings of the Symposium on Operating Sys- tems Principles, 2009, pp. 133–146.

[37] A. M. Caulfield, A. De, J. Coburn, T. I. Mollov et al., “Moneta: A high-performance storage array architecture for next-generation, non-volatile memories,” in Proceedings of the International Symposium on Microarchitecture, 2010, pp. 385–395.

[38] P. Zhou, B. Zhao, J. Yang, and Y. Zhang, “Energy reduction for STT-RAM using early write termination,” in Proceedings of the International Conference on Computer- Aided Design, 2009, pp. 264–268.

[39] M. K. Qureshi, M. M. Franceschini, and L. A. Lastras, “Improving read performance of phase change memories via write cancellation and write pausing,” in Proceedings of the International Symposium on High Performance Computer Architecture, 2010, pp. 1–11.

[40] B.-D. Yang, J.-E. Lee, J.-S. Kim, J. Cho et al., “A low power phase-change ran- dom access memory using a data-comparison write scheme,” in Proceedings of the International Symposium on Circuits and Systems, 2007, pp. 3014–3017.

[41] H. Chung, B.-H. Jeong, B.-J. Min, Y. Choi et al., “A 58nm 1.8V 1Gb PRAM with 6.4MB/s program BW,” in Proceedings of the International Solid-State Circuits Conference, 2011, pp. 500–502.

[42] International Technology Roadmap for Semiconductors, “Process Integration, De- vices, and Structures 2010 Update,” http://www.itrs.net/.

[43] S. Schechter, G. H. Loh, K. Straus, and D. Burger, “Use ECP, not ECC, for hard failures in resistive memories,” in Proceedings of the International Symposium on Computer Architecture, 2010, pp. 141–152.

[44] E. Ipek, J. Condit, E. B. Nightingale, D. Burger, and T. Moscibroda, “Dynamically replicated memory: Building reliable systems from nanoscale resistive memories,” in Proceedings of the Architectural Support for Programming Languages and Operating Systems, 2010, pp. 3–14.

[45] N. H. Seong, D. H. Woo, V. Srinivasan, J. A. Rivers, and H.-H. S. Lee, “SAFER: Stuck-at-fault error recovery for memories,” in Proceedings of the International Symposium on Microarchitecture, 2010, pp. 115–124.

[46] M. K. Qureshi, J. P. Karidis, M. M. Franceschini, V. Srinivasan et al., “Enhancing lifetime and security of PCM-based main memory with start-gap wear leveling,” in Proceedings of the International Symposium on Microarchitecture, 2009, pp. 14–23.

[47] N. H. Seong, D. H. Woo, and H.-H. S. Lee, “Security refresh: prevent malicious wear-out and increase durability for phase-change memory with dynamically random- ized address mapping,” in Proceedings of the International Symposium on Computer Architecture, 2010, pp. 383–394.

[48] D. H. Yoon, N. Muralimanohar, J. Chang, P. Ranganathan et al., “FREE-p: Protecting non-volatile memory against both hard and soft errors,” in Proceedings of the International Symposium on High Performance Computer Architecture, 2011, pp. 466–477.

[49] S. J. E. Wilton and N. P. Jouppi, “CACTI: An enhanced cache access and cycle time model,” IEEE Journal of Solid-State Circuits, vol. 31, pp. 677–688, 1996.

[50] S. Thoziyoor, N. Muralimanohar, J.-H. Ahn, and N. P. Jouppi, “CACTI 5.1 technical report,” HP Labs, Tech. Rep. HPL-2008-20, 2008.

[51] R. J. Evans and P. D. Franzon, “Energy consumption modeling and optimization for SRAM’s,” IEEE Journal of Solid-State Circuits, vol. 30, no. 5, pp. 571–579, 1995.

[52] M. Mamidipaka and N. Dutt, “eCACTI: An enhanced power estimation model for on-chip caches,” Center for Embedded Computer Systems, Tech. Rep. TR04-28, 2004.

[53] N. Muralimanohar, R. Balasubramonian, and N. P. Jouppi, “Architecting efficient interconnects for large caches with CACTI 6.0,” IEEE Micro, vol. 28, no. 1, pp. 69–79, 2008.

[54] X. Dong, X. Wu, G. Sun, Y. Xie et al., “Circuit and microarchitecture evaluation of 3D stacking magnetic RAM (MRAM) as a universal memory replacement,” in Proceedings of the Design Automation Conference, 2008, pp. 554–559.

[55] P. Mangalagiri, K. Sarpatwari, A. Yanamandra, V. Narayanan et al., “A low-power phase change memory based hybrid cache architecture,” in Proceedings of the Great Lakes Symposium on VLSI, 2008, pp. 395–398.

[56] X. Dong, N. P. Jouppi, and Y. Xie, “PCRAMsim: System-level performance, energy, and area modeling for phase-change RAM,” in Proceedings of the International Conference on Computer-Aided Design, 2009, pp. 269–275.

[57] C. Xu, X. Dong, N. P. Jouppi, and Y. Xie, “Design implications of memristor-based RRAM cross-point structures,” in Proceedings of the Design, Automation & Test in Europe, 2011, pp. 1–6.

[58] V. Mohan, S. Gurumurthi, and M. R. Stan, “FlashPower: A detailed power model for NAND flash memory,” in Proceedings of Design, Automation & Test in Europe, 2010, pp. 502–507.

[59] X. Wu, J. Li, L. Zhang, E. Speight et al., “Hybrid cache architecture with disparate memory technologies,” in Proceedings of the International Symposium on Computer Architecture, 2009, pp. 34–45.

[60] A. M. Caulfield, J. Coburn, T. I. Mollov, A. De et al., “Understanding the impact of emerging non-volatile memories on high-performance, IO-intensive computing,” in Proceedings of the International Conference for High Performance Computing, Net- working, Storage and Analysis, 2010, pp. 1–11.

[61] A. Akel, A. M. Caulfield, T. I. Mollov, R. K. Gupta, and S. Swanson, “Onyx: A prototype phase-change memory storage array,” in Proceedings of the USENIX Conference on Hot Topics in Storage and File Systems, 2011, pp. 1–5.

[62] T. Kgil, D. Roberts, and T. Mudge, “Improving NAND flash based disk caches,” in Proceedings of the International Symposium on Computer Architecture, 2008, pp. 327–338.

[63] S. Lee, K. Ha, K. Zhang, and J. Kim, “FlexFS: A flexible flash file system for MLC NAND flash memory,” in USENIX Annual Technical Conference, 2009, pp. 1–14.

[64] M. K. Qureshi, M. M. Franceschini, L. A. Lastras, and J. P. Karidis, “Morphable memory system: A robust architecture for exploiting multi-level phase change memo- ries,” in Proceedings of the International Symposium on Computer Architecture, 2010, pp. 153–162.

[65] J. Wang, Y. Liu, H. Yang, and H. Wang, “A compare-and-write ferroelectric non- volatile flip-flop for energy-harvesting applications,” in Proceedings of the Interna- tional Conference on Green Circuits and Systems, 2010, pp. 646–650.

[66] H. Volos, A. J. Tack, and M. M. Swift, “Mnemosyne: lightweight persistent memory,” in Proceedings of the International Conference on Architectural Support for Program- ming Languages and Operating Systems, 2011, pp. 91–104.

[67] M. Rasquinha, D. Choudhary, S. Chatterjee, S. Mukhopadhyay, and S. Yalamanchili, “An energy efficient cache design using spin torque transfer (STT) RAM,” in Pro- ceedings of the International Symposium on Low power Electronics and Design, 2010, pp. 389–394.

[68] H. Sun, C. Liu, N. Zheng, T. Min, and T. Zhang, “Design techniques to improve the device write margin for MRAM-based cache memory,” in Proceedings of the Great Lakes Symposium on VLSI, 2011, pp. 97–102.

[69] C. W. Smullen, V. Mohan, A. Nigam, S. Gurumurthi, and M. R. Stan, “Relaxing non-volatility for fast and energy-efficient STT-RAM caches,” in Proceedings of the International Symposium on High Performance Computer Architecture, 2011, pp. 50–61.

[70] S. Thoziyoor, J.-H. Ahn, M. Monchiero, J. B. Brockman, and N. P. Jouppi, “A comprehensive memory modeling tool and its application to the design and analysis of future memory hierarchies,” in Proceedings of the International Symposium on Computer Architecture, 2008, pp. 51–62.

[71] International Technology Roadmap for Semiconductors, “The Model for Assessment of CMOS Technologies And Roadmaps (MASTAR),” http://www.itrs.net/models.html.

[72] A. N. Udipi, N. Muralimanohar, N. Chatterjee, R. Balasubramonian et al., “Rethinking DRAM design and organization for energy-constrained multi-cores,” in Proceedings of the International Symposium on Computer Architecture, 2010, pp. 175–186.

[73] S. Kang, W. Y. Cho, B.-H. Cho, K.-J. Lee et al., “A 0.1µm 1.8V 256Mb phase-change random access memory (PRAM) with 66MHz synchronous burst-read operation,” IEEE Journal of Solid-State Circuits, vol. 42, no. 1, pp. 210–218, 2007.

[74] F. Fishburn, B. Busch, J. Dale, D. Hwang et al., “A 78nm 6F2 DRAM technology for multigigabit densities,” in Proceedings of the Symposium on VLSI Technology, 2004, pp. 28–29.

[75] Y. Zhang, S.-B. Kim, J. P. McVittie, H. Jagannathan et al., “An integrated phase change memory cell with Ge nanowire diode for cross-point memory,” in Proceedings of the IEEE Symposium on VLSI Technology, 2007, pp. 98–99.

[76] I. E. Sutherland, R. F. Sproull, and D. F. Harris, Logical effort: Designing fast CMOS circuits. Morgan Kaufmann, 1999.

[77] M. A. Horowitz, “Timing models for MOS circuits,” Stanford University, Tech. Rep., 1983.

[78] E. Seevinck, P. J. van Beers, and H. Ontrop, “Current-mode techniques for high-speed VLSI circuits with application to current sense amplifier for CMOS SRAM’s,” IEEE Journal of Solid-State Circuits, vol. 26, no. 4, pp. 525–536, 1991.

[79] Y. Moon, Y.-H. Cho, H.-B. Lee, B.-H. Jeong et al., “1.2V 1.6Gb/s 56nm 6F2 4Gb DDR3 SDRAM with hybrid-I/O sense amplifier and segmented sub-array architecture,” in Proceedings of the International Solid-State Circuits Conference, 2009, pp. 128–129.

[80] G. W. Burr, M. J. Breitwisch, M. Franceschini, D. Garetto et al., “Phase change memory technology,” Journal of Vacuum Science & Technology B, vol. 28, no. 2, pp. 223–262, 2010.

[81] K. Ishida, T. Yasufuku, S. Miyamoto, H. Nakai et al., “A 1.8V 30nJ adaptive program-voltage (20V) generator for 3D-integrated NAND flash SSD,” in Proceedings of the International Solid-State Circuits Conference, 2009, pp. 238–239, 239a.

[82] P. S. Magnusson, M. Christensson, J. Eskilson, D. Forsgren et al., “Simics: A full system simulation platform,” Computer, vol. 35, no. 2, pp. 50–58, 2002.

[83] Standard Performance Evaluation Corporation, “SPEC OMP (OpenMP Benchmark Suite),” http://www.spec.org/omp/.

[84] C. Bienia, S. Kumar, J. P. Singh, and K. Li, “The PARSEC benchmark suite: char- acterization and architectural implications,” in Proceedings of the International Con- ference on Parallel architectures and Compilation Techniques, 2008, pp. 72–81.

[85] L. Goux, D. T. Castro, G. Hurkx, J. G. Lisoni et al., “Degradation of the reset switching during endurance testing of a phase-change line cell,” IEEE Transactions on Electron Devices, vol. 56, no. 2, pp. 354–358, 2009.

[86] K. Kim and S. J. Ahn, “Reliability investigations for manufacturable high density PRAM,” in Proceedings of the IEEE International Reliability Physics Symposium, 2005, pp. 157–162.

[87] Storage Performance Council, “SPC Specifications: I/O Trace Repository,” http://www.storageperformance.org/specs/#traces.

[88] R. A. Oldfield, S. Arunagiri, P. J. Teller, S. Seelam et al., “Modeling the impact of checkpoints on next-generation systems,” in Proceedings of the Conference on Mass Storage Systems and Technologies, 2007, pp. 30–46.

[89] G. Grider, J. Loncaric, and D. Limpart, “Roadrunner system management report,” Los Alamos National Laboratory, Tech. Rep. LA-UR-07-7405, 2007.

[90] S. E. Michalak, K. W. Harris, N. W. Hengartner, B. E. Takala, and S. A. Wender, “Predicting the number of fatal soft errors in Los Alamos National Laboratory’s ASCI Q supercomputer,” IEEE Transactions on Device and Materials Reliability, vol. 5, no. 3, pp. 329–335, 2005.

[91] Los Alamos National Laboratory, 2009, Reliability Data Sets, http://institutes.lanl.gov/data/fdata/.

[92] J. W. Young, “A first order approximation to the optimal checkpoint interval,” Com- munications of the ACM, vol. 17, pp. 530–531, 1974.

[93] J. T. Daly, “A higher order estimate of the optimum checkpoint interval for restart dumps,” Future Generation Computer Systems, vol. 22, no. 3, pp. 303–312, 2006.

[94] N. R. Adiga, G. Almasi, G. S. Almasi, Y. Aridor et al., “An overview of the Blue- Gene/L supercomputer,” in Proceedings of the Conference on High Performance Com- puting Networking, Storage and Analysis, 2002, pp. 60–71.

[95] D. Meisner, B. T. Gold, and T. F. Wenisch, “PowerNap: Eliminating server idle power,” in Proceedings of the International Conference on Architectural Support for Programming Languages and Operating Systems, 2009, pp. 205–216.

[96] B. C. Lee, E. Ipek, O. Mutlu, and D. Burger, “Architecting phase change memory as a scalable DRAM alternative,” in Proceedings of the International Symposium on Computer Architecture, 2009, pp. 2–13.

[97] P. Zhou, B. Zhao, J. Yang, and Y. Zhang, “A durable and energy efficient main memory using phase change memory technology,” in Proceedings of the International Symposium on Computer Architecture, 2009, pp. 14–23.

[98] M. K. Qureshi, V. Srinivasan, and J. A. Rivers, “Scalable high performance main memory system using phase-change memory technology,” in Proceedings of the Inter- national Symposium on Computer Architecture, 2009, pp. 24–33.

[99] R. Waser and M. Aono, “Nanoionics-based resistive switching memories,” Nature Materials, vol. 6, no. 11, pp. 833–840, 2007.

[100] K.-C. Liu, W.-H. Tzeng, K.-M. Chang, Y.-C. Chan et al., “Transparent resistive random access memory (T-RRAM) based on Gd2O3 film and its resistive switching characteristics,” in Proceedings of the International Nanoelectronics Conference, 2010, pp. 898–899.

[101] K. Eshraghian, K.-R. Cho, O. Kavehei, S.-K. Kang et al., “Memristor MOS con- tent addressable memory (MCAM): Hybrid architecture for future high performance search engines,” IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. PP, no. 99, pp. 1–11, 2010.

[102] P. J. Joseph, K. Vaswani, and M. J. Thazhuthaveetil, “Construction and use of linear regression models for processor performance analysis,” in Proceedings of the Interna- tional Symposium on High-Performance Computer Architecture, 2006, pp. 99–108.

[103] B. C. Lee and D. M. Brooks, “Accurate and efficient regression modeling for microar- chitectural performance and power prediction,” in Proceedings of the International Conference on Architectural Support for Programming Languages and Operating Sys- tems, 2006, pp. 185–194.

[104] O. Azizi, A. Mahesri, B. C. Lee, S. J. Patel, and M. A. Horowitz, “Energy-performance tradeoffs in processor architecture and circuit design: A marginal cost analysis,” in Proceedings of the International Symposium on Computer Architecture, 2010, pp. 26– 36.

[105] E. Ipek, S. A. McKee, K. Singh, R. Caruana et al., “Efficient architectural design space exploration via predictive modeling,” ACM Transactions on Architecture and Code Optimization, vol. 4, no. 4, pp. 1:1–1:34, 2008.

[106] P. J. Joseph, K. Vaswani, and M. J. Thazhuthaveetil, “A predictive performance model for superscalar processors,” in Proceedings of the International Symposium on Microarchitecture, 2006, pp. 161–170.

[107] C. Dubach, T. Jones, and M. O’Boyle, “Microarchitectural design space exploration using an architecture-centric approach,” in Proceedings of the International Sympo- sium on Microarchitecture, 2007, pp. 262–271.

[108] W. S. Sarle, “Stopped training and other remedies for overfitting,” in Proceedings of the Symposium on the Interface of Computing Science and Statistics, 1995, pp. 55–69.

[109] D. W. Marquardt, “An algorithm for least-squares estimation of nonlinear parame- ters,” Journal of the Society for Industrial and Applied Mathematics, vol. 11, no. 2, pp. 431–441, 1963.

[110] S. Kirkpatrick, J. C. D. Gelatt, and M. P. Vecchi, “Optimization by simulated an- nealing,” Science Magazine, vol. 220, no. 4598, pp. 671–680, 1983.

[111] NASA Advanced Supercomputing (NAS) Division, “The NAS Parallel Benchmarks (NPB) 3.3,” http://www.nas.nasa.gov/Resources/Software/npb.html.

[112] S. Li, J.-H. Ahn, R. D. Strong, J. B. Brockman et al., “McPAT: An integrated power, area, and timing modeling framework for multicore and manycore architectures,” in Proceedings of the International Symposium on Microarchitecture, 2009, pp. 469–480.

Vita

Xiangyu Dong

Xiangyu Dong was born in Shanghai, China. He received his B.E. degree in Electrical Engineering from Shanghai Jiao Tong University. He joined the Ph.D. program of the Department of Computer Science and Engineering at Pennsylvania State University in 2007. His research interests include non-volatile memory technology, computer architecture, and three-dimensional (3D) IC design. He has authored and co-authored nineteen conference papers, three journal papers, and three book chapters during his Ph.D. program. He is a student member of both IEEE and ACM. His work on modeling and leveraging emerging non-volatile memories was selected as one of the three graduate winners of the ACM Student Research Competition in 2011.