Using Compression for Energy-Optimized Memory Hierarchies
By Somayeh Sardashti

A dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy (Computer Sciences) at the UNIVERSITY OF WISCONSIN - MADISON, 2015

Date of final oral examination: 2/25/2015

The dissertation is approved by the following members of the Final Oral Committee:
Prof. David A. Wood, Professor, Computer Sciences
Prof. Mark D. Hill, Professor, Computer Sciences
Prof. Nam Sung Kim, Associate Professor, Electrical and Computer Engineering
Prof. André Seznec, Professor, Computer Architecture
Prof. Gurindar S. Sohi, Professor, Computer Sciences
Prof. Michael M. Swift, Associate Professor, Computer Sciences

© Copyright by Somayeh Sardashti 2015
All Rights Reserved

Abstract

In multicore processor systems, last-level caches (LLCs) play a crucial role in reducing system energy by i) filtering out expensive accesses to main memory and ii) reducing the time spent executing in high-power states. Increasing the LLC size can improve system performance and energy by reducing memory accesses, but at the cost of high area and power overheads. In this dissertation, I explore using compression to increase effective LLC capacity and, ultimately, improve system performance and energy consumption. Cache compression is a promising technique for expanding effective cache capacity with little area overhead. Compressed caches can achieve the benefits of larger caches using the area and power of smaller caches by fitting more cache blocks in the same cache space. However, previous compressed cache designs have demonstrated only limited benefits due to internal fragmentation, limited tags, and metadata overhead. In addition, most previous proposals targeted improving system performance even at high power and energy overheads.

In this dissertation, I propose two novel compressed cache designs that are optimized for energy: Decoupled Compressed Cache (DCC) [21] [22] and Skewed Compressed Cache (SCC) [23]. DCC and SCC tightly pack variable-size compressed blocks to reduce internal fragmentation. They exploit spatial locality to track compressed blocks while reducing tag overheads by tracking super-blocks. Compared to conventional uncompressed caches, DCC and SCC lower the cache miss rate by increasing the effective capacity and reducing conflicts. Compared to DCC, SCC further lowers area overhead and design complexity.

In addition to proposing efficient compressed cache designs, I take another step forward to study compression benefits for real applications running on real machines. Because most compressed-caching proposals target hardware that does not yet exist, architects evaluate them using detailed simulators with small benchmarks. It is therefore unclear whether cache compression would benefit real applications running on real machines. In this dissertation, I address this question by analyzing the compressibility of several real applications, including production servers of the Computer Sciences Department of UW-Madison. I show that compression could in fact be beneficial to many real applications.

Table of Contents

ABSTRACT
TABLE OF CONTENTS
LIST OF FIGURES
LIST OF TABLES
ACKNOWLEDGEMENTS

CHAPTER 1 INTRODUCTION
1.1 THE END OF DENNARD SCALING
1.2 WHERE DOES ENERGY GO?
1.3 CACHES AS ENERGY FILTERS
1.4 THESIS CONTRIBUTIONS
1.5 THESIS ORGANIZATION

CHAPTER 2 COMPRESSION ALGORITHMS: BACKGROUND AND RELATED WORK
2.1 COMPRESSION ALGORITHM CLASSIFICATIONS
2.2 METRICS TO EVALUATE THE SUCCESS OF A COMPRESSION ALGORITHM
2.3 COMPRESSION POTENTIALS

CHAPTER 3 MANAGING COMPRESSED DATA IN THE MEMORY HIERARCHY: BACKGROUND AND RELATED WORK
3.1 COMPRESSED CACHES: BACKGROUND AND RELATED WORK
3.2 COMPRESSED MEMORY: BACKGROUND AND RELATED WORK

CHAPTER 4 DECOUPLED COMPRESSED CACHES
4.1 OVERVIEW
4.2 SPATIAL LOCALITY AT LLC
4.3 DECOUPLED COMPRESSED CACHE: ARCHITECTURE AND FUNCTIONALITY
4.4 A PRACTICAL DESIGN FOR DCC
4.5 EXPERIMENTAL METHODOLOGY
4.6 EVALUATION
4.7 CONCLUSIONS

CHAPTER 5 SKEWED COMPRESSED CACHES
5.1 OVERVIEW
5.2 SKEWED ASSOCIATIVE CACHING
5.3 SKEWED COMPRESSED CACHE
5.4 METHODOLOGY
5.5 DESIGN COMPLEXITIES
5.6 EVALUATION
5.7 CONCLUSIONS

CHAPTER 6 ON COMPRESSION EFFECTIVENESS IN THE MEMORY HIERARCHY
6.1 OVERVIEW
6.2 MYTHS ABOUT COMPRESSION
6.3 INFRASTRUCTURE
6.4 COMPRESSION ALGORITHMS
6.5 MYTHBUSTERS: TESTING MYTHS ON COMPRESSION
6.6 CONCLUSIONS

CHAPTER 7 CONCLUSIONS
7.1 DIRECTIONS FOR FUTURE WORK

REFERENCES

List of Figures

Figure 1-1: Communication vs. computation energy [72].
Figure 2-1: Compression ratio of different compression algorithms.
Figure 3-1: Limits of previous compressed caches.