Design of Coarse-Grained Power Gating for a Fine-Grained Many-Core Processor Array

DESIGN OF COARSE-GRAINED POWER GATING FOR A FINE-GRAINED MANY-CORE PROCESSOR ARRAY by Shylesh Umapathy A thesis submitted in partial fulfillment of the requirements for the degree of Master of Science in Engineering, Computer Engineering Option in the Lyles College of Engineering California State University, Fresno May 2018 ii APPROVED For the Department of Electrical and Computer Engineering We, the undersigned, certify that the thesis of the following student meets the required standards of scholarship, format, and style of the university and the student's graduate degree program for the awarding of the master's degree. Shylesh Umapathy Aaron Stillmaker (Chair) Electrical and Computer Engineering Reza Raeisi Electrical and Computer Engineering Nan Wang Electrical and Computer Engineering For the University Graduate Committee: Dean, Division of Graduate Studies iii AUTHORIZATION FOR REPRODUCTION OF MASTER’S THESIS Yes, I grant permission for the reproduction of this thesis in part or in its entirety without further authorization from me, on the condition that the person or agency requesting reproduction absorbs the cost and provides proper acknowledgment of authorship. Permission to reproduce this thesis in part or in its entirety must be obtained from me. Signature of thesis author: iv ABSTRACT DESIGN OF COARSE-GRAINED POWER GATING FOR A FINE-GRAINED MANY-CORE PROCESSOR ARRAY With the 53rd commemoration of Moore’s law and transistor sizing heading towards 4 nm, the number of transistors on an integrated circuit continue to double every year. However, there are many factors limiting this growth rate such as power consumption, which is a serious impediment for design of high-speed, low-power integrated circuits. In modern semiconductor manufacturing, leakage power in high- performance processors accounts for 20-30% of the total power. Power gating is one approach to reduce the power consumption of an integrated circuit by effectively disconnecting the power supply from blocks during idle mode and is commonly used in the design of commercial high-end processors and in SoC for portable applications. In this thesis, coarse-grained power gating techniques are explored to reduce the power consumption of a fine-grained many-core processor array. This work provides a detailed analysis and comparison of the design tradeoffs between using the ring and grid methods of power gates placement. The RTL to GDSII flow was performed using Synopsys EDA tools and NanGate FreePDK45 standard open cell library. The results depict that, on a test 128-bit MAC unit, placing power gates using the grid methodology created an 8.52% less IR drop but an 8-9% congestion increase when compared to the ring methodology. The leakage power of the 128-bit MAC unit was reduced by 99.98% with both the grid and ring placement of power gates when compared to a design without power gates. Shylesh Umapathy May 2018 ACKNOWLEDGMENTS Foremost, I would like to express my sincere gratitude to my advisor Dr. Aaron Stillmaker for the continuous support during my thesis, for his patience, immense knowledge, and motivation. He has always been there for my help and without his guidance, this thesis and report wouldn’t have been possible. I was fortunate enough to work under his guidance and I would cherish this experience for a long time. I am very thankful to my committee members Dr. Reza Raeisi, who has always inspired me with his leadership qualities and has always been there to help a student in every possible way. I wouldn’t have had a better understanding of Computer Architecture, if I hadn’t worked with Dr. Nan Wang, his seminars have helped me to sharpen my presentation skills. I am also very thankful to all the faculty members of Electrical and Computer Engineering for all the help. I would like to thank my friends, especially Meet Shah, who has always helped me, whenever I ran into a trouble while installing and configuring the Synopsys tools. I would also like to thank Brian Cardwell, who had helped me to generate all the great plots in this report. Finally, I must express my very profound gratitude to my family for providing me with unfailing support and continuous encouragement throughout my years of study. TABLE OF CONTENTS Page LIST OF TABLES ......................................................................................... viii LIST OF FIGURES .......................................................................................... ix CHAPTER 1: INTRODUCTION .................................................................... 11 CHAPTER 2: BACKGROUND ...................................................................... 14 Low Power Architectures ......................................................................... 16 Power Gating ............................................................................................ 21 Related Work ............................................................................................ 26 CHAPTER 3: TOOLS AND LIBRARIES ...................................................... 33 NanGate FreePDK45 ................................................................................ 34 CHAPTER 4: METHODOLOGY ................................................................... 36 Library preparation for NanGate FreePDK45 .......................................... 36 Designing 128-bit MAC unit .................................................................... 37 Synthesizing gate-level netlist .................................................................. 38 Place and Route Physical Design ............................................................. 40 Header design in Custom Compiler .......................................................... 50 HSPICE simulation of Power gates .......................................................... 52 CHAPTER 5: RESULTS AND ANALYSIS .................................................. 53 ModelSim Simulation ............................................................................... 53 Place and Route ........................................................................................ 53 CHAPTER 6: CONCLUSION AND FUTURE WORK ................................. 66 Conclusion ................................................................................................ 66 Future Work .............................................................................................. 67 APPENDICES ....................................................................................................... 73 vii vii Page APPENDIX A: Library Compiler and Milkyway ................................................. 74 APPENDIX B: TLUplus File ................................................................................ 76 APPENDIX C: Verilog Code ................................................................................ 79 APPENDIX D: Design Compiler Script ................................................................ 81 APPENDIX E: IC Compiler Script ........................................................................ 84 APPENDIX F: HSPICE Deck ............................................................................... 88 APPENDIX G: Detailed Timing Report ................................................................ 90 APPENDIX H: Detailed Congestion Report ....................................................... 100 LIST OF TABLES Page Table 1. Impact of scaling on ARM processor MIPS/mW [4] ........................ 11 Table 2. Fine-grained power gating techniques [18] ....................................... 27 Table 3. Comparison between NEMS and CMOS header switch [24] ............ 32 Table 4. Leakage Power of Power gates .......................................................... 52 Table 5. Impact of power gates placed in grid manner .................................... 58 Table 6. Impact of power gates placed in ring manner .................................... 60 Table 7. Impact of power gates on voltage and current ................................... 62 LIST OF FIGURES Page Figure 1. Header and Footer power gates [17] ................................................ 22 Figure 2. Grid of switches [19] ........................................................................ 23 Figure 3. Ring of switches [19] ........................................................................ 24 Figure 4. Leakage Power reduction [20] .......................................................... 28 Figure 5. Power reduction based on gating [20] .............................................. 28 Figure 6. Power reduction per core in AMD x86-64 [12] ............................... 29 Figure 7. Noise due to power gating [21] ........................................................ 29 Figure 8. Incremental scheduling [21] ............................................................. 30 Figure 9. Energy saved based on temperature [22] .......................................... 30 Figure 10. Impact of temperature on Leakage Power [23] .............................. 31 Figure 11. 128-bit MAC unit ........................................................................... 38 Figure 12. Block diagram of the 128-bit MAC unit in Design Compiler ........ 39 Figure 13. Detailed circuit diagram in Design Compiler ................................. 39 Figure 14. Voltage domain .............................................................................. 41 Figure 15. Voltage rails .................................................................................... 41 Figure 16. Power gates

Load more