Architectural Techniques to Enhance DRAM Scaling

Submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Electrical and Computer Engineering

Yoongu Kim
B.S., Electrical Engineering, Seoul National University

Carnegie Mellon University
Pittsburgh, PA
June, 2015

Abstract

For decades, main memory has enjoyed the continuous scaling of its physical substrate: DRAM (Dynamic Random Access Memory). But now, DRAM scaling has reached a threshold where DRAM cells cannot be made smaller without jeopardizing their robustness. This thesis identifies two specific challenges to DRAM scaling, and presents architectural techniques to overcome them.

First, DRAM cells are becoming less reliable. As DRAM process technology scales down to smaller dimensions, it is more likely for DRAM cells to electrically interfere with each other’s operation. We confirm this by exposing the vulnerability of the latest DRAM chips to a reliability problem called disturbance errors. By reading repeatedly from the same cell in DRAM, we show that it is possible to corrupt the data stored in nearby cells. We demonstrate this phenomenon on Intel and AMD systems using a malicious program that generates many DRAM accesses. We provide an extensive characterization of the errors, as well as their behavior, using a custom-built testing platform. After examining various potential ways of addressing the problem, we propose a low-overhead solution that effectively prevents the errors through a collaborative effort between the DRAM chips and the DRAM controller.

Second, DRAM cells are becoming slower due to worsening variation in DRAM process technology. To alleviate the latency bottleneck, we propose to unlock fine-grained parallelism within a DRAM chip so that many accesses can be served at the same time. We take a close look at how a DRAM chip is internally organized, and find that it is divided into small partitions of DRAM cells called subarrays.
Although the subarrays are mostly independent, they occasionally rely upon some global circuit components that force the subarrays to be operated one at a time. To overcome this limitation, we devise a series of non-intrusive changes to DRAM architecture that increases the autonomy of the subarrays and allows them to be accessed concurrently. We show that such parallelism across subarrays provides large performance gains at low cost.

Lastly, we present a powerful DRAM simulator that facilitates the design space exploration of main memory. Unlike previous simulators, our simulator is easy to modify, allowing DRAM architectural changes to be modeled quickly and accurately. This is why our simulator is able to provide out-of-the-box support for a wide array of contemporary DRAM standards. Our simulator is also the fastest, outperforming the next fastest simulator by more than a factor of two.

Acknowledgments

First and foremost, I would like to thank my advisor, Professor Onur Mutlu. He embodies the quotation from Quintilian, the Roman rhetorician who said, “We should not write so that it is possible for the reader to understand us, but so that it is impossible for him to misunderstand us.” He always strived for the highest level of clarity and thoroughness in research, and propelled me to become a better thinker, writer, and presenter. He gave me the opportunities, the resources, and the guidance that allowed me to be who I am today. This thesis was only made possible by his encouragement and nurturing throughout all of my years in the SAFARI research group.

I am grateful to the members of my thesis committee: Professor Onur Mutlu, Professor James Hoe, Professor Todd Mowry, and Professor Trevor Mudge. They truly cared about my research and me as an individual. I also thank Professor Mor Harchol-Balter who introduced me to research, and taught me how to use precise language and high-level abstractions in technical prose.

I am grateful to my internship mentors, who gave me the freedom to pursue my ideas, the counsel to achieve my goals, and their friendship in my times of turmoil: Chris Wilkerson, Moin Qureshi, Sangyeun Choi, Konrad Lai, Uksong Kang, Suresh Chittor, Suneeta Sah, and Michele Franceschini.

I thank the Korea Foundation for Advanced Studies, Intel Corporation, and Samsung Electronics for their generous financial support.

I am grateful to Samantha Goldstein and Elaine Lawrence, ECE departmental advisors who sent me many a reminder to make sure that I turned in my paperwork on time, and who took a personal interest in my academic progress.

During graduate school, members of the SAFARI research group have been like family to me. Justin Meza was a force for calm in my life with his even-keeled, zen-like outlook on everything and everyone. Chris Fallin was a source of inspiration for me with his computer wizardry, self-discipline, and engaging insights about technology and research. Chris Craik would never fail to catch me off-guard with his dry, situational one-liners that made me burst out in laughter. Rachata Ausavarungnirun was not only a gifted chef, but was also one of the nicest persons I have ever met, and I am grateful for the countless occasions when he lent me a helping hand. Donghyuk Lee was our resident DRAM expert, who ever so graciously and patiently gave me indispensable guidance on many facets of my research. Hongyi Xin spiced up my dinners by introducing me to the newest Chinese restaurants, where he taught me so much about Chinese history and politics. Kevin Chang was always kind enough to tolerate my dumb jokes, and we had lots of fun hanging out together, especially when drunk. Samira Khan was a thoughtful and empathetic listener, who, with her sidekick Omar Chowdhury, could also be hilariously and mercilessly self-deprecating.
Vivek Seshadri was never hesitant to offer incisive criticism that would invariably rescue me from an impasse in writing. Lavanya Subramanian is a genuinely caring person who would always look out for everyone around her. Gennady Pekhimenko tempered me in my thoughts and opinions by challenging my biases and preventing me from jumping to conclusions. Jamie Liu was a talented programmer who brought depth and clarity to every discussion. I also thank other members of the SAFARI research group for their companionship and assistance: HanBin Yoon, Ben Jaiyen, Yixin Luo, Nandita Vijaykumar, Yang Li, Kevin Hsieh, Amirali Boroumand, and Saugata Ghose.

At Carnegie Mellon, I had the privilege of making many great friends. Michael Papamichael blew me away with his work ethic and detail-orientedness, and I am indebted to him for the many all-nighters that we pulled together for that one paper. Michelle Goodstein and I were often the only ones to be left in the office during the late evenings, and she was my shoulder to cry on when I was particularly stressed over a looming deadline. Eric Chung was the wise and senior graduate student whom I wanted to grow up and be like. Peter Klemperer had the most fascinating stories to tell about his escapades and misdemeanors from a previous life. Evangelos Vlachos was a party animal whom I sorely missed after he left for Switzerland.

I thank Ross Daly, Weikun Yang, Jeremie Kim, and Ji Hye Lee for their tremendous research contributions as undergraduate/masters interns. I thank Seungil Huh, Jee Eun Song, Sung Won Park, and Dongsu Han for helping me get settled into Pittsburgh when I first arrived, and Jane Seo for her care and understanding. I also thank Mei Chen, Michael Kozuch, Varun Gupta, Peter Milder, Olatunji Ruwase, Anshul Gandhi, Gabe Weisz, Alexey Tumanov, Berkin Akin, and Eugene Wang.
Although our paths diverged geographically, it was comforting to know that my old buddies, Heeyoung Lee and Hongsup Shin, were always there to support me with their camaraderie and solidarity.

Lastly, I would like to acknowledge the unwavering love from my family: my grandmother Worsoon Kim, my parents Hyosoon and Dohyun Kim, and my sister Taeyeun Kim.

Contents

1 Introduction
    1.1 The Problem: DRAM Scaling at Risk
    1.2 This Thesis: A Higher-Level Approach to DRAM Scaling
        1.2.1 Improving DRAM Reliability
        1.2.2 Improving DRAM Performance
        1.2.3 Facilitating DRAM Research
    1.3 Thesis Contributions
    1.4 Outline
2 Disturbance Errors: A New Reliability Problem in DRAM
    2.1 Disturbance Errors in Today’s DRAM Chips
    2.2 DRAM Background
        2.2.1 High-Level Organization
        2.2.2 Low-Level Organization
        2.2.3 Accessing DRAM
        2.2.4 Refreshing DRAM
    2.3 Mechanics of Disturbance Errors
    2.4 Real System Demonstration
    2.5 Experimental Methodology
    2.6 Characterization Results
        2.6.1 Disturbance Errors are Widespread
        2.6.2 Access Pattern Dependence
        2.6.3 Address Correlation: Aggressor & Victim
        2.6.4 Data Pattern Dependence
    2.7 Sensitivity Results
    2.8 Solutions to Disturbance Errors
        2.8.1 Six Potential Solutions
        2.8.2 Seventh Solution: PARA
    2.9 Other Related Work
    2.10 Chapter Summary
3 Subarray Parallelism: A High-Performance DRAM Architecture
    3.1 Bank Conflicts Exacerbate DRAM Latency
    3.2 Background: DRAM Organization
        3.2.1 Bank: Logical Organization & Operation
        3.2.2 Timing Constraints
        3.2.3 Subarrays: Physical Organization of Banks
    3.3 Motivation
    3.4 Overview of Proposed Mechanisms
        3.4.1 SALP-1: Subarray-Level-Parallelism-1
        3.4.2 SALP-2: Subarray-Level-Parallelism-2
        3.4.3 MASA: Multitude of Activated Subarrays
