Microarchitecture-Aware Physical Planning for Deep Submicron Technology

A Thesis Presented to The Academic Faculty

by Mongkol Ekpanyapong

In Partial Fulfillment of the Requirements for the Degree Doctor of Philosophy

School of Electrical and Computer Engineering
Georgia Institute of Technology
May 2006

Approved by:

Asst. Professor Sung Kyu Lim, Advisor
School of Electrical and Computer Engineering
Georgia Institute of Technology

Professor Abhijit Chatterjee
School of Electrical and Computer Engineering
Georgia Institute of Technology

Professor Sudhakar Yalamanchili
School of Electrical and Computer Engineering
Georgia Institute of Technology

Asst. Professor Gabriel H. Loh
College of Computing
Georgia Institute of Technology

Asst. Professor Hsien-Hsin Sean Lee
School of Electrical and Computer Engineering
Georgia Institute of Technology

Date Approved: November 14, 2005

Dedicated to my mother, Anongrath Saesua

and my father, Veerachai Ekpanyapong

ACKNOWLEDGEMENTS

I would like to express my sincere gratitude to Professor Sung Kyu Lim for his guidance of my research and his patience during my Ph.D. study at Georgia Tech. I would like to thank my thesis committee members, Professor Sudhakar Yalamanchili, Professor Hsien-Hsin Sean Lee, Professor Abhijit Chatterjee, and Professor Gabriel H. Loh, for their valuable suggestions, especially Professor Yalamanchili and Professor Lee, who served as my reading committee members. I also would like to thank my thesis proposal committee member, Professor Alan Doolittle, for his valuable suggestions. I would like to thank all the members of the GTCAD and CREST groups for their support and friendship, especially Jacob Minz, Michael B. Healy, Thaisiri Watewai, Ismail F. Baskaya, J.C. Park, Chinnakrishnan S. Ballapuram, Kiran Puttaswamy, Rodric Rabbah, Charles Hardnett, Eric Wong, and Mohit Pathak, who helped me in many ways throughout my Ph.D. career. I would like to give special thanks to my friend Pinar Korkmaz for her help in proofreading this dissertation and her suggestions throughout my Ph.D. career. I also would like to thank Professor Krishna Palem, Professor P. S. Thiagarajan, and Professor Kanchana Kanchanasut for giving me the opportunity to join the program. I would like to thank Professor Weng-Fai Wong, who taught me many things in my first year at Georgia Tech.

My family has always supported me in all my decisions and for that I express my deepest gratitude and love. I also would like to thank my best friend Patcharee Basu for her love, support, and understanding.

TABLE OF CONTENTS

DEDICATION

ACKNOWLEDGEMENTS

LIST OF TABLES

LIST OF FIGURES

SUMMARY

I INTRODUCTION
1.1 Problem Statement
1.2 Contributions
1.3 Thesis Organization

II ORIGIN AND HISTORY OF THE PROBLEM
2.1 Background
2.2 Microarchitecture Floorplanning
2.3 Retiming

III MICROARCHITECTURAL FLOORPLANNING
3.1 Wire Delay Issues
3.2 Problem Formulation
3.2.1 Design Flow
3.2.2 Problem Formulation
3.3 Profile-guided Floorplanning
3.3.1 Mixed Integer Programming Model
3.3.2 Linear Programming Model
3.4 Simulation Infrastructure
3.5 Experimental Results
3.6 Summary

IV 3D MICROARCHITECTURAL FLOORPLANNING
4.1 Problem Formulation
4.1.1 Design Flow
4.1.2 Problem Formulation
4.2 3D Thermal Analysis
4.2.1 3D Power Analysis
4.2.2 3D Thermal Analysis
4.3 Simulation Infrastructure
4.4 Thermal-aware 3D Partitioning and Floorplanning
4.4.1 3D Partitioning
4.4.2 MILP (Mixed Integer Linear Program)-based 3D Floorplanning
4.4.3 Linear Relaxation
4.5 Experimental Results
4.6 Summary

V MICROARCHITECTURAL DESIGN SPACE EXPLORATION
5.1 Impact of Process Technology and Challenges
5.2 Adaptive Search
5.2.1 Gradient Search
5.2.2 Smart Start
5.2.3 Priority Search
5.2.4 Search Pruning
5.2.5 Gradient First-order Ratio
5.3 Experimental Results
5.3.1 Experimental Results for Design Space Exploration
5.3.2 Experimental Results for Microarchitecture Enhancement
5.4 Summary

VI MICROARCHITECTURE-AWARE CIRCUIT PLACEMENT WITH RETIMING
6.1 Problem Formulation
6.2 Adaptive Network Characterization
6.2.1 Observations
6.2.2 Adaptive Network Characterization Methodology
6.2.3 AGEO Experimental Results
6.3 GEO-PD
6.3.1 GEO-PD Experimental Results
6.4 Summary

VII MICROARCHITECTURE-AWARE RETIMING AND VOLTAGE ASSIGNMENT
7.1 Methodology
7.1.1 Low Power Retiming
7.1.2 ILP-based Voltage Scaling
7.1.3 Linear Programming Relaxation
7.1.4 Voltage Mapping
7.1.5 Post Refinement
7.2 Experimental Results
7.3 Conclusions

VIII MICROARCHITECTURE-AWARE CIRCUIT PLACEMENT WITH STATISTICAL RETIMING
8.1 Statistical Longest Path Analysis
8.1.1 Statistical Bellman-Ford Algorithm
8.1.2 Limitation of the Bellman-Ford Extensions
8.1.3 Statistical Bellman-Ford Algorithm
8.2 Global Placement with Statistical Retiming
8.2.1 Modeling Delay Distribution
8.2.2 Statistical Retiming based Timing Analysis
8.2.3 Bounds on Target Clock Period
8.2.4 Mincut-based Constructive Placement
8.2.5 Retiming Delay Distribution
8.3 Experimental Results
8.4 Summary

IX CONCLUSION

REFERENCES

PUBLICATIONS

LIST OF TABLES

1 Microarchitecture configurations used in the study.
2 Access latency range (in clock cycles) among the microarchitecture modules.
3 Runtime breakdown in seconds.
4 Brute force search space.
5 Running time comparison.
6 Initial condition used by Smart Start.
7 Correlation of microarchitecture parameters.
8 Gradient First-order Ratio (GFR).
9 Benchmark circuit characteristics.
10 Performance improvement result.
11 Benchmark circuit characteristics.
12 Comparison among ESC, GEO, GEO-P, and GEO-PD on 8x8 floorplanning. Each algorithm reports wirelength, retiming delay (Dr), normal delay (Dn), visible power (Pv), and total power (Pt).
13 Impact of retiming objective. The total power includes the power consumed by FFs.
14 Voltage scaling result for the RVS algorithm.
15 Final power distribution.
16 Total power comparison among several voltage scaling algorithms. Retiming and post refinement are performed for all five algorithms.
17 Benchmark circuit characteristics.
18 The comparison between the deterministic Monte Carlo simulation, the eSBF, and the kSBF on 8x8 dimension.

LIST OF FIGURES

1 VLSI Design Process.
2 New VLSI Design Process.
3 Impact of wire delay and FF insertion on module access latency. Moving module 2 from left to right causes the length of wire to increase. This requires more FFs to be inserted in order to meet the clock period constraint, which in turn causes the access latency to increase.
4 Overview of the profile-guided microarchitectural floorplanning framework.
5 MINP (Mixed Integer Non-linear Programming) formulation of the profile-guided microarchitectural floorplanning.
6 Number of FF constraint. The numbers on the wires represent their delay values. Assuming that the clock period constraint is 4, Equation (2) decides to put 2 FFs on both wires (⌈(3 + 4)/4⌉ = 2 and ⌈(2 + 3)/4⌉ = 2). FF3 and FF4 are added so that they provide their values to module 3 at the same time (= input synchronization).
7 Module height approximation for MINP to MILP conversion, where mi = ai/(4wmin,i · wmax,i) and ki = ai/(4wmax,i + 4wmin,j).
8 MILP (Mixed Integer Linear Programming) formulation of the profile-guided microarchitectural floorplanning.
9 Description of the LP-based slicing floorplanning algorithm. A top-down recursive bipartitioning is performed and LP-based floorplanning is solved at each iteration. The MILP is solved again after the last iteration using the slicing floorplanning result.
10 LP (Linear Programming) formulation of the profile-guided microarchitectural floorplanning. This LP is used to perform floorplanning at iteration u of the main algorithm shown in Figure 24.
11 Illustration of recursive bipartitioning for the relaxation of integer constraints. Recursive bipartitioning is performed until each module is placed in one block. The objective is to minimize the weighted wirelength under the boundary, center of mass, and clock period constraints. After the recursive bipartitioning is done, relative positions among the modules (left, right, above, and below) are fed to the integer programming for the final iteration.
12 Illustration of the area compaction scheme.
13 Processor microarchitecture model.
14 IPC comparison between WL (wirelength-driven) and PGFi (profile-guided) floorplanning algorithms. Area compaction is not used.
15 IPC comparison between AWL (area/wirelength-driven) and PGF (profile/area/wirelength optimization) floorplanning algorithms. Area compaction is used.
16 Wirelength comparison among WL (wirelength-driven), AWL (area/wirelength-driven), and PGF (profile/area/wirelength-driven) floorplanners. These results are normalized to the PGFi (profile-only) algorithm.
17 Area comparison among WL (wirelength-driven), AWL (area/wirelength-driven), and PGF (profile/area/wirelength-driven) floorplanners. These results are normalized to the PGFi (profile-only) algorithm.
18 BIPS comparison among AWL (area/wirelength-driven), PGF (profile/area/wirelength-driven), and LOW (lower bound with no wire delay) for several future technologies. These results are normalized to the WL (wirelength-only) algorithm for a 5GHz design.
19 A snapshot of a PGF floorplan.
20 Overview of the thermal-aware 3D microarchitectural floorplanning framework.
21 3D grid of a chip for thermal modeling.
22 Processor microarchitecture model.
23 Mixed integer linear programming model of the thermal-aware 3D microarchitectural floorplanning.
24 Description of the thermal-aware 3D floorplanning algorithm. A top-down recursive bipartitioning is performed and LP-based floorplanning is solved at each iteration. MILP is solved again after the last iteration using the slicing floorplanning result.
25 LP (Linear Programming) formulation of the thermal-aware 3D microarchitectural floorplanning. This LP is used to perform floorplanning at iteration u of the main algorithm shown in Figure 24.
26 Performance versus frequency scaling for different numbers of layers.
27 IPC performance.
28 Maximum temperature.
29 Impact of number of layers on IPC, normalized values averaged across benchmarks.
30 Impact of number of layers on maximum temperature, normalized values averaged across benchmarks.
31 Microarchitecture planning.
32 Performance versus frequency scaling.
33 Microarchitecture planning design space exploration framework.
34 Gradient search algorithm.
35 Detailed processor microarchitecture.
36 Performance speedup comparison using average brute force as the baseline.
37 Area comparison using average brute force as the baseline.
38 Execution Time × Area comparison using average brute force as the baseline.
39 Performance improvement by applying Semantic Aware Cache.
40 Microarchitectural planning with retiming.
41 Correlations between retiming delay and weighted cutsize on circuit s1238.
42 Impact of different α values.
43 Impact of adaptive number of runs.
44 Example of net filtering.
45 Impact of different net filtering.
46 Slack distribution classifications, where the x-axis represents slack value and the y-axis represents cumulative frequency percentage normalized to 100%.
47 Impact of different ε values.
48 Examples of medium and fast start.
49 Impact of clustering.
50 Overview of the A-GEO algorithm.
51 Overview of the GEO-PD algorithm.
52 Illustration of partitioning tree and breadth-first cut sequence in the GEO-PD algorithm. v and h denote vertical and horizontal cuts. A K-way refinement is performed when there are 2^j blocks (j > 1).
53 Overview of the delay and power weighting functions in the GEO-PD algorithm.
54 An example showing that a simple rounding algorithm does not work with the LC constraint.
55 Linear programming relaxation algorithm to solve the ILP-based voltage scaling problem.
56 Voltage mapping algorithm under LC and timing constraints. k = 1 and k = 3 denote the high Vdd state in line 6.
57 8 possible supply voltage assignments for the dotted nodes. The invalid solutions are either non-optimal (= additional LC increases total power) or violate the LC constraint, and thus LP never generates them.
58 The cluster-based post refinement algorithm that performs voltage scaling under LC and timing constraints. Each cluster contains a set of reachable nodes with positive timing slack.
59 A recursive algorithm that computes the total power gain of a given node and all of its predecessors from voltage scaling refinement.
60 A description of the outcome-by-outcome Statistical Bellman-Ford (oSBF) algorithm.
61 Illustration of Bellman-Ford update.
62 k-Statistical Bellman-Ford algorithm (kSBF) used in our statistical retiming-based timing analysis (SRTA).
63 Positive cycle.
64 Impact of cycle on retiming delay distribution.
65 The distribution comparison among the Monte Carlo simulation (solid), the eSBF (dotted), and the kSBF (dashed).

SUMMARY

The main objective of this thesis is to develop a new design paradigm that combines microarchitecture design and circuit design with physical design for deep submicron technology. In deep submicron technology, wire delay will be the bottleneck for high-performance microarchitecture design. Given the location information, inter-module latency can be calculated and, hence, the performance of the system can be estimated.

In this thesis, we present a novel microarchitectural floorplanning technique that can help computer architects tackle the wire delay problem early in the microarchitecture design stage. We show that by employing microarchitectural floorplanning, up to 40% performance gain can be achieved. We also extend the framework to three-dimensional integrated circuits (3D-ICs). 3D-IC is a new integration technique that has also been introduced to address the wire delay issue in deep submicron technology. By combining microarchitectural floorplanning with 3D-ICs, we show that the impact of wire delay can be reduced substantially. We also show that not only the module locations but also the module sizes can impact performance. An adaptive search engine is introduced to identify the right module sizes. Using our adaptive search engine, we show that the system can identify good module sizes that improve performance, with a shorter run-time than a runtime-limited brute-force search.

Our microarchitecture-aware physical planning assumes that the target clock period can be achieved by inserting more flip-flops into the system. Inserting flip-flops along the wires allows the system to meet the timing constraints without violating the correctness of the circuit on that path, because the function of a wire is to transfer a signal from one location to another. However, inserting flip-flops along paths that contain gates cannot guarantee the correctness of those paths. The circuit optimization technique that allows flip-flop insertion along circuit paths is called retiming. In this dissertation, we show that retiming can be used to achieve the target clock period in microprocessor design. With the same target clock period, a power reduction technique can be combined with retiming to help reduce the power consumption. We show that up to 34% power reduction can be achieved without timing violations. Furthermore, to tackle the problem of process variation in deep submicron technology, we also propose a modified retiming that can tolerate errors from statistical timing computation. We show that our statistical retiming algorithm produces results close to those of Monte Carlo simulation.

CHAPTER I

INTRODUCTION

According to the projection of the International Technology Roadmap for Semiconductors (ITRS) [103], deep submicron process technology will soon be able to integrate more than one billion transistors onto a single monolithic die. Given the continuing and fast miniaturization of transistor feature size, global wirelength is becoming a major hindrance to keeping performance scalable, since its delay relative to gate delay gradually worsens as technology continues to shrink. Local wirelength, on the other hand, will scale with only a marginal impact in added delay within the same process technology [56]. Despite new types of materials, novel circuit techniques, and modern architectures, including nanotechnology, this global interconnect limit persists because of the nature of device physics, and it results in a substantial performance impact for chips manufactured with deep submicron technology, particularly for microprocessors, which keep pushing the envelope of high performance as the primary design objective.
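The scaling contrast between local and global wires can be illustrated with a hedged first-order sketch. The calculation below is not from this thesis: it assumes a simple lumped distributed-RC (Elmore-style) delay model and hypothetical per-unit resistance and capacitance values, chosen only to show the trend.

```python
# First-order illustration (illustrative values, not from the thesis) of why
# global wire delay scales worse than gate delay. Assumes an Elmore-style
# distributed-RC model: delay ~ 0.5 * r * c * L^2, where r and c are
# per-unit-length resistance and capacitance. Under ideal scaling by s, the
# wire cross-section shrinks, so r grows by ~s^2 while c per unit length
# stays roughly constant.

def wire_delay(r_per_um, c_per_um, length_um):
    """Distributed RC delay of an unbuffered wire (Elmore approximation)."""
    return 0.5 * r_per_um * c_per_um * length_um ** 2

s = 2.0                  # one linear shrink factor between technology nodes
r0, c0 = 0.1, 0.2e-15    # ohm/um and F/um: hypothetical baseline values

# A local wire shrinks with the gates it connects (L -> L/s); a global wire
# spans the die, whose size stays roughly constant (L unchanged).
local_old  = wire_delay(r0,        c0, 100.0)
local_new  = wire_delay(r0 * s**2, c0, 100.0 / s)
global_old = wire_delay(r0,        c0, 10000.0)
global_new = wire_delay(r0 * s**2, c0, 10000.0)

print(f"local wire delay ratio (new/old):  {local_new / local_old:.1f}")   # ~1.0
print(f"global wire delay ratio (new/old): {global_new / global_old:.1f}") # ~4.0
```

Under this toy model, a local wire's delay stays roughly flat across a generation while a global wire's delay grows with the square of the shrink factor; since gate delay simultaneously improves, the global wire-to-gate delay ratio degrades even faster, which is the qualitative point made above.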

With this global interconnect bottleneck, physical design, which is currently performed after logic synthesis, becomes of central interest. The VLSI design methodologies, which are composed of architecture design, circuit design, logic synthesis, physical design, and fabrication, in that order as shown in Figure 1, have to be modified to address the wire delay issue. Information from physical design can help guide the other design steps toward superior performance. Changing a gate or module location can result in longer or shorter wires and hence increase or decrease wire delay. Integration and iteration of physical design with the other design paradigms is likely to be how next-generation circuits and processors are designed.

For the last decade, because of the dramatic advancement of microelectronics and manufacturing technology, computer architects were able to improve processor performance simply by adding more computing capability and increasing resource capacity, for example, increasing cache sizes and hierarchy, enlarging the reorder buffer, widening the issue width, and improving speculation by employing complex branch predictors, to name a few. All of these architectural enhancements effectively resulted in higher processor performance in the past. On the other hand, CAD tool developers and circuit designers tried to reduce the cycle time as much as possible, paying little attention to the entire design at the architectural level. With the increasing impact of global wirelength, however, such design methods can lead to less optimal, if not totally ineffective, designs because of the intra-chip communication latency. Thus, the physical design method has to change by taking the wires into account.

[Figure 1: VLSI Design Process. A flow from Microarchitecture Design through Behavior or Functional Design, Logic Design, Circuit Design, Physical Design, and Fabrication to Packaging and Testing.]

In this dissertation, microarchitecture-aware physical planning is proposed as a new design paradigm for next-generation microprocessor design. Microarchitecture-aware physical planning integrates microarchitecture design with physical design. A floorplanning algorithm is developed that considers the location of each module in order to improve the performance of the microarchitecture. Another promising technique that has received increasing attention for tackling the wire delay issue is the three-dimensional integrated circuit (3D-IC). 3D-IC has an additional advantage over traditional integrated circuits since layers of circuits can be stacked on top of each other, and hence wire delay can be reduced dramatically. However, when many layers of circuits are stacked together, heat dissipation becomes a problem and temperature-aware floorplanning has to be performed. The microarchitecture-aware physical planning framework is extended to consider both performance and temperature on 3D-ICs. Next, we show that not only the location but also the size of each module is important. A bigger (smaller) module requires longer (shorter) wires to connect to other modules, and hence incurs higher (lower) wire delay. A design space exploration technique that searches for the right module sizes in a given technology is studied.

The microarchitecture-aware physical planning discussed so far assumes that the target clock period can be achieved by inserting more flip-flops into the system. Inserting flip-flops along the wires can make the system meet the timing without violating the correctness of the circuit on that path, because the function of a wire is to transfer a signal from one location to another. However, inserting flip-flops along paths that consist of circuit gates cannot guarantee the correctness of those paths. The circuit optimization technique that allows flip-flop insertion along circuit paths is called retiming [75]. The objective of microarchitecture-aware circuit placement is to achieve the target clock period with minimum flip-flop insertion. In addition, when the target clock period is met, some paths of the circuit have timing slack available. We propose a retiming technique that uses this available slack to decrease the power consumption. Furthermore, to overcome the issue of process variations, we use statistical retiming, which captures the manufacturing variations.
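To make the gate-versus-wire distinction concrete, here is a minimal sketch of the classical retiming legality condition of Leiserson and Saxe [75], on which this kind of flip-flop relocation rests. The toy circuit, helper names, and lag values are illustrative assumptions, not the implementation used in this dissertation.

```python
# Minimal sketch (not the thesis's algorithm) of the Leiserson-Saxe retiming
# formulation [75]: a retiming assigns an integer lag r(v) to every gate, and
# the flip-flop count on each edge (u, v) becomes
#     w_r(u, v) = w(u, v) + r(v) - r(u).
# The retiming is legal iff every retimed edge weight is non-negative, which
# is what preserves the circuit's input/output behavior.

def retime(edges, r):
    """edges: {(u, v): ff_count}; r: {gate: lag}. Returns retimed FF counts."""
    return {(u, v): w + r[v] - r[u] for (u, v), w in edges.items()}

def is_legal(retimed):
    return all(w >= 0 for w in retimed.values())

# Toy circuit: a -> b -> c with one FF on (b, c) and none on (a, b).
edges = {("a", "b"): 0, ("b", "c"): 1}

# Moving the FF backward across gate b (lag r(b) = 1) is legal...
print(is_legal(retime(edges, {"a": 0, "b": 1, "c": 0})))   # True
# ...but trying to "borrow" a FF that is not there is not.
print(is_legal(retime(edges, {"a": 1, "b": 0, "c": 0})))   # False
```

The first lag assignment moves the flip-flop from edge (b, c) to edge (a, b) without changing the circuit's behavior; the second would require a negative register count and is rejected, which is exactly why flip-flops can be placed freely along wires but must be retimed carefully across gates.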

To combine the techniques that we propose, the VLSI design process has to be modified to include the integration between microarchitecture design and physical design. In addition, the design decisions made early in the high-level design process have to be respected by the subsequent VLSI design steps. The new VLSI design process for microarchitecture-aware physical planning is shown in Figure 2.

1.1 Problem Statement

This research work addresses a novel design paradigm for high-performance microarchitecture design in deep submicron technology. To sustain Moore's Law, the wire delay has to be addressed as early as possible, and possibly even during the microarchitecture design.

[Figure 2: New VLSI Design Process. The flow of Figure 1, modified so that physical design information is fed back from physical design to the earlier design stages.]

In this dissertation, we study microarchitecture-aware physical planning to address the performance issue (especially the wire delay problem) in deep submicron technology. However, considering performance alone might not be sufficient or realistic, and hence other constraints, such as area, power, and temperature, need to be imposed as well. Our research provides a new design technique that integrates microarchitecture and physical design to improve the performance of high-performance microarchitectures in deep submicron technology.

1.2 Contributions

In this dissertation, we present a microarchitecture-aware physical planning framework. This framework can help computer architects address the wire delay issue during the early stage of microprocessor design.

The following items are the main contributions of this research:

• Microarchitectural Floorplanning: We propose a paradigm shift in traditional microarchitecture and physical design. The framework consists of microarchitectural design space exploration, microarchitectural floorplanning, and retiming. Microarchitectural design space exploration searches for the right size of each module, whereas microarchitectural floorplanning identifies the location of each module. To achieve the target clock frequency, a retiming technique is used in conjunction. We propose a modified retiming that can handle process variations in deep submicron technology.

• Microarchitecture-Aware Physical Floorplanning: We propose a novel design technique that integrates floorplanning with microarchitecture design. By performing microarchitectural floorplanning in the early phase of the design, wire delay information becomes available and hence can be used to estimate microprocessor performance. We achieve substantial gains in performance over microarchitecture design without microarchitectural floorplanning.

• 3D Microarchitectural Floorplanning: 3D-IC is a new technique that addresses the wire delay issue. By stacking silicon wafers on top of each other, wire delay can be reduced. We extend our microarchitectural floorplanning to consider 3D-ICs. We show that substantial improvements in performance can be obtained by increasing the number of layers of a 3D-IC. However, when wafers are stacked together, temperature becomes a major issue of concern. Without considering temperature, 3D integration can result in thermal runaway and chip breakdown. Hence, we extend our objective to consider both performance and temperature during the design.

• Microarchitectural Design Space Exploration: In the past, microarchitecture design took advantage of available die area by increasing module sizes. With the impact of wire delay, this no longer holds. Enlarging a module implies longer wirelength, which in turn increases the wire delay. Hence, searching for the right module size becomes important. We propose microarchitectural design space exploration to search for proper module sizes. In addition, we combine it with microarchitectural floorplanning to obtain substantial performance gains.

• Microarchitecture-Aware Circuit Placement with Retiming: So far, we have proposed a technique that considers the interaction between microarchitecture design and physical design. To achieve the target clock period, we propose a retiming technique. By inserting flip-flops at the primary outputs of each module, the target clock period can be achieved. The objective here is to minimize the number of flip-flops and the power consumption of that module. Also, in the presence of process variations in deep submicron technology, we use statistical retiming to achieve the target clock period.

1.3 Thesis Organization

The dissertation is organized into nine chapters.

• CHAPTER I. INTRODUCTION: This chapter introduces microprocessor design issues for deep submicron technologies. It also summarizes the contributions of the thesis and explains its organization.

• CHAPTER II. ORIGIN AND HISTORY OF THE PROBLEM: This chapter describes previous work in microarchitecture-aware physical planning and retiming.

• CHAPTER III. MICROARCHITECTURAL FLOORPLANNING: This chapter introduces the novel microarchitectural floorplanning technique. The mathematical formulation to solve the problem is presented. Then, the modified architecture to support non-uniform latency is discussed. Finally, the experimental results are illustrated.

• CHAPTER IV. 3D MICROARCHITECTURAL FLOORPLANNING: This chapter extends microarchitectural floorplanning to consider three-dimensional integrated circuits (3D-ICs). The mathematical model is extended to include 3D properties such as vias and temperature. Finally, the experimental results are reported.

• CHAPTER V. MICROARCHITECTURAL DESIGN SPACE EXPLORATION: This chapter proposes a design space exploration for the right module sizes to enhance the performance of the system. A gradient search is proposed to identify the module sizes. Then, the experimental results are shown.

• CHAPTER VI. MICROARCHITECTURE-AWARE CIRCUIT PLACEMENT WITH RETIMING: This chapter explains how retiming can help microarchitecture-aware physical planning. In addition, it discusses a technique that can be used to reduce the target clock cycle with the same number of flip-flops. It also proposes a physical planning with retiming technique that considers the power consumption of the circuit while maintaining the target clock period.

• CHAPTER VII. MICROARCHITECTURE-AWARE RETIMING AND VOLTAGE ASSIGNMENT: This chapter proposes a way to reduce the power consumption while maintaining the same target clock period by using both supply and threshold voltage assignment. Mathematical programs are formulated to solve the problem.

• CHAPTER VIII. MICROARCHITECTURE-AWARE CIRCUIT PLACEMENT WITH STATISTICAL RETIMING: This chapter considers retiming in the presence of process variation. It shows that the deterministic retiming algorithm cannot be used to solve the process variation issue. Then, a modified version of the retiming algorithm that can tolerate errors from statistical timing computation is proposed. Finally, the results show that the proposed approach performs statistical timing analysis that is close to the results of Monte Carlo simulation.

• CHAPTER IX. CONCLUSION: This chapter summarizes the major accomplishments of this dissertation.

CHAPTER II

ORIGIN AND HISTORY OF THE PROBLEM

2.1 Background

In this section, we explain the important concepts used in this dissertation. The VLSI design methodologies are composed of architecture design, circuit design, logic synthesis, physical design, and fabrication, in that order. Physical design is the process of identifying gate locations after a circuit representation (or netlist) becomes available from logic synthesis. It is composed of four main processes: partitioning, floorplanning, placement, and routing.

Floorplanning and placement, often referred to as physical planning, are the processes that identify the locations of modules or gates. The difference between floorplanning and placement is that floorplanning identifies the locations of flexible modules, which are modules that have a fixed area but whose width and height can be changed under some aspect ratio constraint, whereas placement considers only non-flexible modules or gates. The traditional objectives of floorplanning are reducing area and total wirelength, while the traditional objective of the placement algorithm is to minimize total wirelength. Lately, however, more objectives have been introduced into physical planning, including performance, power, temperature, and so forth.

Floorplanning algorithms can be classified into two classes: slicing and non-slicing. Slicing floorplanning algorithms [109, 129, 130] generate a slicing structure, which is a rectangular dissection that can be obtained by repetitively subdividing rectangles horizontally or vertically. Its optimal solution can be computed in polynomial time. However, it cannot provide as high-quality a solution as a non-slicing floorplan, especially for the area objective. Non-slicing floorplanning, on the other hand, is an NP-hard problem. Non-slicing floorplanning algorithms can be classified into two approaches: mathematical programming based [114, 133] and heuristic based [82, 83], such as simulated annealing [82, 83] and simulated evolution [96].

The placement problem can be classified into two classes: global placement and detailed placement. Global placement identifies the region where each group of cells should be located, whereas detailed placement assigns a precise location to each cell while preserving the global placement solution. Recently, global placement has played a significant role because of the increase in circuit constraints and complexity. Global placement algorithms can be classified into three major approaches: analytical [42, 68], simulated annealing [112, 115], and mincut-based [13, 57, 126, 6, 17]. Mincut-based approaches recursively partition the circuit top-down into sub-netlists and assign gates to tiles. Owing to their fast running time and flexibility in handling various constraints, mincut-based algorithms have been adopted in many modern state-of-the-art placers, including a state-of-the-art timing-driven placement [33].
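The top-down mincut flow can be sketched as follows. This is illustrative only: a real flow would call a min-cut partitioner (e.g., hMETIS) at each level to minimize cut nets, whereas here a simple coordinate-sorted split stands in for the cut computation, and the x-coordinate hints are a made-up netlist proxy.

```python
def global_place(cells, region, depth):
    """Recursively assign cells to tiles.

    cells:  list of (name, x_hint) pairs
    region: bounding box (x0, y0, x1, y1)
    Returns {cell name: (tile center x, tile center y)}.
    """
    if depth == 0 or len(cells) <= 1:
        x0, y0, x1, y1 = region
        center = ((x0 + x1) / 2.0, (y0 + y1) / 2.0)
        return {name: center for name, _ in cells}
    # "Cut" step: split the cell set into two halves (stand-in for min-cut).
    ordered = sorted(cells, key=lambda c: c[1])
    mid = len(ordered) // 2
    x0, y0, x1, y1 = region
    xm = (x0 + x1) / 2.0
    placement = global_place(ordered[:mid], (x0, y0, xm, y1), depth - 1)
    placement.update(global_place(ordered[mid:], (xm, y0, x1, y1), depth - 1))
    return placement

placement = global_place([("g%d" % i, i) for i in range(8)], (0, 0, 16, 16), 2)
```

After two levels of recursion, each pair of cells shares a tile center, which is exactly the granularity a detailed placer would then refine.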

2.2 Microarchitecture Floorplanning

With the growing concern over global wire delay, a number of research efforts have focused on different aspects, including circuits, microarchitectures, and the combination of logic synthesis and physical design. Agarwal et al. [5] raised the issue of the wirelength impact on designing conventional microarchitectures. They showed that reducing the feature size and increasing the clock rate do not necessarily imply an overall performance improvement for deep submicron processor designs. Cong et al. [35] confirmed their observation and showed that without considering clock speed, Instructions Per Cycle (IPC), widely used in architecture research, can be misleading in evaluating the performance of next-generation processors. To achieve a high clock frequency, inserting flip-flops between modules, which results in deeper pipelining, is required, as suggested in [76].

More recently, Sankaralingam et al. [98] proposed a new data bypassing mechanism that enhances performance when multi-cycle bypassing delays are needed in a processor. In logic synthesis, novel techniques [30, 52] were proposed to improve performance by applying wiring-aware logic synthesis. These techniques provide more location information during the logic synthesis phase, leading to overall performance improvement. However, postponing the optimization until after the logic synthesis phase completes can be time consuming and inapplicable to custom designs. A similar work in microarchitecture floorplanning was proposed in [78]. In this work, many floorplans are simulated in advance, and the performance results are stored in a table together with their corresponding floorplans. During floorplanning, the algorithm looks up the current floorplan solution in the table and reports the instructions-per-cycle (IPC) value if a match with the stored data is found. If no match can be found in the table, a trajectory method is used to predict the IPC value from the stored data.

As next-generation processors increase in complexity, there are many parameters that can be changed to optimally achieve the objective of high-performance processor design. The value assigned to each of these parameters affects the overall performance, power, and area. The applications that run on these processors also impact performance, adding another dimension to the search space of optimal parameter selection. The microprocessor designer faces a great challenge in determining the correct set of parameters leading to a configuration in a multi-objective design space, such as a performance-area or timing-power trade-off, or performance alone. In practice, optimizing for one objective may affect the other design objectives. A more straightforward approach is an exhaustive brute-force search, which may not be effective because the design space is very large; it may not even be possible to simulate all configurations to determine the best one. Cong et al. [35] used microarchitecture evaluation with physical planning that considers both IPC and clock cycle time in the floorplan design. They introduced a new type of move in the simulated annealing engine called configuration selection. When a new configuration is selected, the dimensions and the timing characteristics of some blocks change, giving rise to a new IPC. Multi-objective design space exploration using genetic algorithms has been proposed for efficiently exploring a parameterized system-on-chip architecture to discover Pareto-optimal configurations [92]. They use a genetic algorithm framework called SPEA2, which, when applied to design space exploration, is very effective in finding points along the actual Pareto-optimal front. Tony et al. [49] modeled the dependencies between the parameters as a directed dependence graph and proposed an algorithm to search the configuration space to determine the optimal Pareto points for different objectives. HP Labs and STMicroelectronics [44] designed a customizable and scalable VLIW processor technology called Lx that allows variations in instruction issue width, instruction set, and a few structures based on the target application. The HP Labs Program-In Chip-Out (PICO) project [99] focuses on automatic design from high-level specifications. The PICO system compiles the source code into a custom hardware design in the form of a parallel, special-purpose processor array. The user, or a design exploration tool, can specify the number of processing elements and their performance. Tensilica's [131] Xtensa processor architecture and processor generator provide designers with a powerful and unique way to create optimized processors and ancillary circuits for System-on-Chip (SoC) implementations. A code-analysis and hardware-generation tool automatically creates processor extensions that accelerate critical functions in C/C++ source code. Custom extensions can include new instructions, registers, and function units. The tool can evaluate thousands of possible extensions and sort them by performance (clock cycles) and efficiency (gate count).

2.3 Retiming

Retiming [75] is a logic optimization technique that shifts the positions of flip-flops (FFs) for delay minimization or FF reduction. Recently, retiming has become more attractive in physical design as wire delay becomes more significant in deep submicron technology [104, 56]. Combining retiming with placement exploits geometric information: since location information is available, wire delay calculation becomes more accurate, further enhancing retiming techniques.

Retiming-based placement can be classified into two approaches: iterative and simultaneous. The iterative approach [116, 85, 86] first performs placement or floorplanning, followed by retiming. The alternative approach [33, 104, 35] performs placement or floorplanning simultaneously with retiming by incorporating retiming information during placement. In [35], the authors suggest that the latter approach is better than the former with respect to retiming delay improvement. In [33], a state-of-the-art approach for mincut-based placement with retiming, called GEO, was proposed. The concepts of sequential arrival time (SAT) [93] and sequential required time (SRT) were adopted there. The sequential slack value, which is used to identify critical gates/clusters after retiming, is computed as the difference between SRT and SAT. Cong et al. [35] extended [33] by generalizing the model to handle gates/clusters with multiple outputs. Two parameters play a critical role in delay weight-based algorithms: ε and α. ε determines the size of the sub-network, the so-called ε-network, that contains timing-critical nodes, whereas α determines the amount of additional weight added to the nets in the ε-network. Both [33] and [35] fix α and ε during the entire pass of their algorithms. In [104, 69], the authors propose a way to identify critical edges using a criticality distribution. However, with their method, one has to search iteratively until a proper ε-network and net weight constant α are found. This is time consuming and, for some circuits, hard to achieve.
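The ε-network net-weighting scheme can be sketched as below. The slack values, base weight, and selection rule here are illustrative assumptions, not the exact GEO formulation; in a GEO-style flow the slack would be the sequential slack SRT − SAT.

```python
def weight_nets(net_slack, epsilon, alpha, base=1.0):
    """Nets whose slack is within epsilon of the worst slack form the
    epsilon-network and receive alpha additional weight."""
    worst = min(net_slack.values())
    weights = {}
    for net, slack in net_slack.items():
        if slack <= worst + epsilon:   # net belongs to the epsilon-network
            weights[net] = base + alpha
        else:
            weights[net] = base
    return weights

weights = weight_nets({"n1": -2.0, "n2": 0.5, "n3": 3.0},
                      epsilon=1.0, alpha=4.0)
```

With ε = 1.0, only the most critical net falls in the ε-network and is up-weighted; the iterative search described above amounts to re-running this selection with different (ε, α) pairs.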

Increasing clock speed in high-performance circuit design results in higher switching activity and hence higher power consumption. Power is one important objective in circuit design, and it is proportional to switching activity, supply voltage (Vdd), and capacitance. Wire capacitance can be calculated more accurately during the physical design process, when wire distance is known or can be roughly approximated. Much power-driven physical planning research has been proposed, including [124, 22, 123, 31]. Most of these works address dynamic power dissipation by moving high-switching gates and their fan-out gates closer together.

With decreasing feature sizes in deep submicron technology, it is harder to control device parameters such as effective gate length, doping concentration, and oxide thickness during the manufacturing process. This problem is referred to as process variation. Process variations can be classified into two classes: inter-die and intra-die. Inter-die process variations are the variations between different dies of a wafer, between different wafers, and between different wafer lots. Intra-die process variations, on the other hand, are variations within the same die. These process variations have a direct impact on circuit delay [12, 80, 84, 77, 2]. Gate and wire delays can no longer be treated as deterministic numbers, as in traditional timing analysis methods; randomness has to be included in delay computation. Timing analysis that includes probability distributions is referred to as statistical timing analysis (STA). STA can be classified into two classes: Monte-Carlo-based approaches [55, 62] and analytical approaches [37, 10, 20, 4, 90, 73, 125, 91, 71, 3, 23, 65]. The Monte-Carlo approach is the most accurate; however, it results in prohibitively large runtimes for large circuits. The analytical approach is faster but suffers a loss in accuracy; it can be divided into two classes: path-based [20, 91] and block-based [37, 10, 4, 90, 73, 125, 71, 3, 23, 65]. The path-based approach proceeds in a depth-first-search manner, whereas the block-based approach proceeds breadth-first. The disadvantage of the path-based approach is that enumerating all possible paths requires exponential runtime. The block-based approach has received more attention recently because of its practical runtime. However, compared with traditional approaches, STA is still time consuming. This is because statistical timing computation requires propagating a probability distribution function (PDF) or cumulative distribution function (CDF) along the netlist instead of just a number, as in traditional timing analysis. Another reason is correlation. Unlike inter-die variations, intra-die process variations exhibit correlation between some paths (reconvergent-path correlation) and within the same region (spatial correlation). If gates, wires, paths, or regions are correlated, the covariance information has to be propagated throughout the netlist for subsequent computation. Hence, a large memory space is required to store this covariance table. Slow timing analysis can make physical design fail, since thousands or millions of timing analyses have to be performed, depending on circuit size. To alleviate this runtime problem, many approaches have been proposed. Agarwal et al. [4] proposed a technique that uses a discrete PDF/CDF instead of a continuous function; however, only reconvergent-path correlation was considered in their study. The authors also proposed a graph reduction technique, which can reduce the problem size and further reduce runtime. Upper and lower bound computations are used for path pruning. In [3], the authors extended their previous work by considering spatial correlation; however, reconvergent-path correlation is handled by treating it as an upper bound, which can lead to an inaccurate approximation. Devgan et al. [37] showed that a piecewise-linear PDF/CDF can be used more efficiently than a discrete PDF/CDF. Chang et al. [20] introduced the concept of principal component analysis (PCA) to STA computation so that many correlated delay terms can be handled more efficiently. Bhardwaj [10] computed STA using a Bayesian network; however, it still results in exponential runtime for large circuits. Le et al. [73] suggested an efficient way to store the correlation table. Orshansky et al. [91] proposed a technique for handling non-Gaussian distributions. Visweswariah et al. [125] introduced the concept of a tight bound to efficiently identify the critical path. In addition, they proposed the concepts of probabilistic arrival time and probabilistic required time to compute path criticality. This is done by assuming that the delay model is in first-order form, similar to the PCA approach, where each delay term in the delay model is independent. Both gate and wire delay distributions are assumed to be Gaussian (normal). Combining several techniques, STA from this method has linear complexity in circuit size. The works in [71, 23] extended STA to support sequential circuits by considering the setup time and hold time of flip-flops as random variables.
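The Monte-Carlo flavor of STA can be sketched in a few lines. The three-gate DAG and its delay parameters below are hypothetical, gate delays are independent Gaussians, and the spatial and reconvergent-path correlations discussed above are deliberately ignored.

```python
import random

def mc_sta(gates, edges, samples=2000, seed=0):
    """gates: {name: (mean, sigma)}; edges: (src, dst) pairs listed in
    topological order. Returns the sampled critical-path delays."""
    rng = random.Random(seed)
    results = []
    for _ in range(samples):
        delay = {g: rng.gauss(mu, sigma) for g, (mu, sigma) in gates.items()}
        arrival = dict(delay)
        for src, dst in edges:  # propagate worst-case arrival times
            arrival[dst] = max(arrival[dst], arrival[src] + delay[dst])
        results.append(max(arrival.values()))
    return results

delays = mc_sta({"a": (3.0, 0.3), "b": (2.0, 0.2), "c": (4.0, 0.4)},
                [("a", "c"), ("b", "c")])
```

The sample set approximates the full critical-path delay distribution, from which any percentile can be read off; the cost, as noted above, is one full netlist traversal per sample.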

CHAPTER III

MICROARCHITECTURAL FLOORPLANNING

In this chapter, we advocate the coalition of architecture design and physical design. By considering both simultaneously, a much better overall performance improvement is expected for microprocessor design in deep submicron technology. Here, profile-guided microarchitectural floorplanning, which considers physical location information during the design phase, is proposed.

The rest of this chapter is organized as follows. Section 3.1 overviews the implications of IPC, clock speed, and flip-flop (FF) insertion. Section 3.2 presents the overall design flow and problem formulation. Section 3.3 introduces profile-guided floorplanning and its mathematical foundation. A description of the microarchitectural framework follows in Section 3.4. The experimental results are shown in Section 3.5. Finally, this chapter is concluded in Section 3.6.

3.1 Wire Delay Issues

Ho, Mai, and Horowitz [56] classified wires into three categories based on their delay impact: 1) wires that scale in length, such as local wires within logic blocks; 2) wires that do not scale in length, whose delay becomes super-linear as feature size is reduced; and 3) repeated wires, i.e., long wires with repeated buffers inserted on them. The propagation delay of a repeated wire can be represented by the equation below, where n is the number of repeater segments, Rd is the driver resistance, w is the width of the driver transistor, Cd and Cg are the diffusion and gate capacitances per unit width, Rw and Cw are the wire resistance and capacitance per unit length, l is the repeater segment length, and β is the pMOS-to-nMOS sizing ratio.

D = 0.7n · [ Rd(w(β + 1)(Cd + Cg) + lCw)/w + RwCw·l²/2 + lRww(β + 1)Cg ]
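The repeated-wire delay expression can be evaluated directly. The function below is a term-by-term transcription of the equation; the argument values used in testing are arbitrary sanity-check numbers, not the thesis's 30nm parameters.

```python
def repeated_wire_delay(n, Rd, w, Cd, Cg, Rw, Cw, l, beta):
    """Delay of a wire driven through n repeater segments of length l
    (any consistent unit system)."""
    driver = Rd * (w * (beta + 1) * (Cd + Cg) + l * Cw) / w
    wire = Rw * Cw * l * l / 2.0
    load = l * Rw * w * (beta + 1) * Cg
    return 0.7 * n * (driver + wire + load)
```

Because the wire term grows as l² while the driver and load terms grow linearly in l, shortening the repeater segments (larger n, smaller l) keeps the total delay linear in overall wire length, which is exactly why repeated wires are the preferred global-wire style.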


Figure 3: Impact of wire delay and FF insertion on module access latency. Moving module 2 from left to right causes the length of wire to increase. This requires more FFs to be inserted in order to meet the clock period constraint, which in turn causes the access latency to increase.

In next-generation deep submicron processor design, it is likely that repeaters will be inserted frequently on global wires to prevent the wire delay from becoming non-linear. In this chapter, repeated wires are assumed, and the performance impact from the perspective of floorplanning is examined. Based on the predicted values of resistance, capacitance, and other parasitic parameters from [102, 56], the repeated wire delay is approximated to be 80 ps/mm for 30nm technology. This is used as the baseline for the discussion in Section 3.3.1. Note that an FO4 gate delay for 30nm is approximately 17 ps.

Flip-flop insertion is a technique to alleviate the impact of wire delay in order to achieve the target clock frequency. The deeper pipelining enabled by flip-flop insertion results in a higher clock frequency and higher BIPS (billion instructions per second) [76]. Nevertheless, the improvement cannot always be anticipated, especially for designs with a small feature size. The reason is that flip-flop insertion may cause IPC degradation through its increased latency, as shown in Figure 3. Therefore, inserting flip-flops without a meticulous measure does not guarantee an overall performance improvement. It is possible that a change in the number of flip-flops on wires may require a redesign of the control units and an update of the microarchitectural floorplan. Allocating a larger size for the control units, so that there is room to accommodate any additional change to the control unit design, can help solve this problem.


Figure 4: Overview of the profile-guided microarchitectural floorplanning framework.

3.2 Problem Formulation

3.2.1 Design Flow

An overview of the profile-guided microarchitectural floorplanning is shown in Figure 4. The framework combines technology scaling parameters and the execution profiling information of applications to guide the floorplanning step of a given microarchitecture design. First, a machine description is provided as an input to the microarchitecture simulator, in which profiling counters are instrumented for bookkeeping module-to-module communication. Then a cycle-accurate simulation is performed to collect and extract the amount of interconnection traffic between modules for a given benchmark program. For cache-like or buffer-like structures, the area and module delay are estimated using CACTI [101], a tool from HP Western Research Labs. For scaling other structures such as ALUs, GENESYS [41], developed at the Georgia Institute of Technology, is used.

After the timing and area information of each module is collected, the module-level netlist, the statistical interconnection traffic, and a processor target frequency are fed into the profile-guided floorplanner. The purpose is to generate a floorplan from which all the inter-module latency values of the given microarchitecture are derived. With the new latency values, the architecture performance simulation is performed to obtain more realistic and accurate IPC and BIPS data.1

CACTI is an integrated simulator that estimates the access time, cycle time, area, aspect ratio, and power of cache/buffer-like structures. The inputs of CACTI include the cache size, block size, associativity, number of read/write ports, and technology parameters. By varying the technology parameters, CACTI can show the impact of different technologies on performance. GENESYS is likewise used to study the impact of different technology scaling on each architectural module. However, unlike cache structures, non-cache structures are harder to model, and the corresponding circuit itself can change with transistor size. GENESYS assumes no change in the circuit design and estimates module area and delay based on a set of empirical modeling equations. The inputs of GENESYS include the number of transistors, the critical path delay, and technology parameters. A Verilog model of an ARM processor is used to obtain the number of transistors in each module. The scaling improvement ratio is collected from GENESYS and stored in an internal table. Note that CACTI and GENESYS are used because of their simplicity in finding module size and delay. More importantly, they allow us to study the impact of technology scaling on the performance of microarchitectural designs.

For profiling, the SimpleScalar 3.0 tool suite [8] is used to collect the number of accesses each module makes to every other module. After the profiling is completed, the traffic values are normalized to the range [0, 1]. The inputs of the profiling simulator include a target application and a training input set. Note that multiple input sets could be used to collect multiple access patterns, averaging their behavior for more accurate frequency values.
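The normalization step can be sketched as follows; the module pairs and access counts below are hypothetical.

```python
def normalize_traffic(counts):
    """counts: {(src, dst): accesses} -> {(src, dst): value in [0, 1]},
    scaled so the busiest wire maps to 1.0."""
    peak = max(counts.values())
    if peak == 0:
        return {edge: 0.0 for edge in counts}
    return {edge: c / peak for edge, c in counts.items()}

lam = normalize_traffic({("fetch", "decode"): 800,
                         ("decode", "alu"): 400,
                         ("alu", "dcache"): 100})
```

The normalized values play the role of the λij weights used later in the floorplanning objective, so heavily used wires are kept short at the expense of rarely used ones.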

3.2.2 Problem Formulation

Given a set of microarchitectural modules and a netlist that specifies the connectivity among these modules, the profile-guided microarchitectural floorplanner tries to place each module such that (i) there is no overlap among the modules, and (ii) a user-specified clock period

1Note that this entire design flow can be repeated multiple times to evaluate multiple microarchitectural designs in terms of IPC, BIPS, and clock cycle, thereby enabling microarchitectural design space exploration. This is the focus of ongoing work.

MINP Formulation

Minimize  Σ_{(i,j)∈E} [U1·λij·zij + U2·(Xij + Yij)] + U3·Xm  (1)

Subject to:

zij ≥ (gi + α(Xij + Yij))/L,  (i, j) ∈ E  (2)
Xij ≥ xi − xj,  (i, j) ∈ E  (3)
Xij ≥ xj − xi,  (i, j) ∈ E  (4)
Yij ≥ yi − yj,  (i, j) ∈ E  (5)
Yij ≥ yj − yi,  (i, j) ∈ E  (6)
zij ≥ fij,  (i, j) ∈ E  (7)
Xm ≥ xi,  i ∈ N  (8)
A · Xm ≥ yi,  i ∈ N  (9)
xi + wi ≤ xj − wj + M(cij + dij),  i < j ∈ N  (10)
xi − wi ≥ xj + wj − M(1 + cij − dij),  i < j ∈ N  (11)
yi + ai/(4wi) ≤ yj − aj/(4wj) + M(1 − cij + dij),  i < j ∈ N  (12)
yi − ai/(4wi) ≥ yj + aj/(4wj) − M(2 − cij − dij),  i < j ∈ N  (13)
wmin,i ≤ wi ≤ wmax,i,  i ∈ N  (14)
xi, yi ≥ 0,  i ∈ N  (15)
cij, dij ∈ {0, 1},  i < j ∈ N  (16)
zij ∈ N,  (i, j) ∈ E  (17)

Figure 5: MINP (Mixed Integer Non-linear Programming) formulation of the profile-guided microarchitectural floorplanning.

constraint is satisfied. The objective is to minimize the overall execution time of a given processor. BIPS (Billion Instructions Per Second) is used for this purpose; it is the average number of billions of instructions that can be issued in one second. IPC (Instructions Per Cycle) is the average number of instructions that can be issued in one clock cycle. In VLSI circuit design, the clock period, which equals the longest path delay of a solution, is used to evaluate the quality of logic synthesis and physical design solutions. The clock frequency F, which denotes the total number of cycles per second, is the reciprocal of the clock period. Finally, BIPS = IPC × F if F is in the gigahertz range. In this chapter, the IPC is maximized under a clock frequency constraint so that the overall BIPS is maximized.
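The BIPS relation can be made concrete with a small worked example; all numbers below are made up for illustration.

```python
def freq_ghz(clock_period_ns):
    """Clock frequency F in GHz from the clock period L in nanoseconds
    (F = 1/L)."""
    return 1.0 / clock_period_ns

def bips(ipc, f_ghz):
    """Billions of instructions per second: BIPS = IPC x F (F in GHz)."""
    return ipc * f_ghz

# e.g., a 0.4 ns clock period gives F = 2.5 GHz, and an IPC of 1.6 at
# that frequency yields 4.0 BIPS.
```

This makes the trade-off explicit: flip-flop insertion raises F but can lower IPC, and only the product matters.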

19 3.3 Profile-guided Floorplanning

In this section, mathematical programming models for the profile-guided microarchitectural floorplanner are presented. First, a Mixed Integer Non-linear Programming (MINP) model that minimizes the weighted wirelength under a performance constraint is illustrated. Since finding an optimal solution for an MINP model is NP-hard, various techniques are applied to convert the MINP model to an MILP (Mixed Integer Linear Programming) model and then to an LP (Linear Programming) model.

3.3.1 Mixed Integer Programming Model

The input module-level netlist is modeled with a directed graph G(N, E), where N denotes the set of all flexible modules, and E denotes the set of directed edges. A directed edge (i, j) represents a wire from module i to module j. Each multi-pin net in the netlist is decomposed into a set of source-sink module pairs. Let α be the repeated wire delay per 1 mm, as discussed in Section 3.1, and λij be the statistical traffic on wire (i, j), i.e., the normalized access count from module i to module j. gi is the delay of module i. wmin,i and wmax,i denote the minimum and maximum half width of module i. The area of module i is denoted by ai. Finally, fij is the number of flip-flops on wire (i, j) in the given netlist.

Let L (= 1/clock frequency) denote the target clock period of the given microarchitectural design, which is an input to the floorplanner. In the MINP model, the solver has to determine the values of the following decision variables: xi, yi, wi, hi, and zij. Let (xi, yi) ∈ R²₊ denote the location of the center of module i. Xij and Yij represent |xi − xj| and |yi − yj|, respectively. zij is the number of flip-flops on wire (i, j) after FF insertion. wi and hi denote the half width and half height of module i. To avoid overlap between two modules i and j, where i < j, a set of binary variables is required so that at least one of the following holds:

xi + wi ≤ xj − wj,  i is on the left of j
xi − wi ≥ xj + wj,  i is on the right of j
yi + ai/(4wi) ≤ yj − aj/(4wj),  i is below j
yi − ai/(4wi) ≥ yj + aj/(4wj),  i is above j

Let cij and dij be binary variables such that

(cij, dij) = (0, 0)  if i is on the left of j
             (0, 1)  if i is on the right of j
             (1, 0)  if i is below j
             (1, 1)  if i is above j

 (1, 1) if i is above j   Figure 5 shows the MINP formulation of the profile-guided microarchitectural floorplan-

ning. A weighted delay of an edge (i, j) is defined to be λij · zij, where the weight λij is

based on module access frequency. zij is the number of FFs on the wire. Xm and Ym are the

maximum values among all xi and yi, respectively. A = Ym/Xm is the aspect ratio of the

chip. Since the area objective Xm · Ym is non-linear, we linearize it by minimizing Xm while

maintaining A · Ym > yi for all modules. Thus, the objective of the MINP formulation (=

Equation (1) in Figure 5) is to minimize the weighted sum of (i) weighted delay among all

wires, (ii) total wirelength, and (iii) total area. U1, U2, and U3 are user-specified weighting constants.

Constraint (2) in Figure 5 is obtained from the definition of latency. If there is no FF on a wire (i, j), the delay of this wire is calculated as d(i, j) = α(Xij + Yij). Then, gi + d(i, j) represents the latency of module i accessing module j. Since L denotes the clock period constraint, (gi + d(i, j))/L gives the minimum number of FFs required on (i, j) in order to satisfy the clock period constraint.2 Non-overlapping constraints are given in (3)–(6). Constraint (7) requires that no existing FFs are removed from the wires. Constraints

2In fact, this formula includes one extra FF inserted at the end of the wire, unlike conventional retiming [75]. In case a module has multiple fan-in wires, these extra FFs ensure that the data transfers on these wires are synchronized so that the input values for the module are available at the same time.


Figure 6: Number-of-FFs constraint. The numbers on the wires represent their delay values. Assuming that the clock period constraint is 4, Equation (2) puts 2 FFs on both wires (⌈(3 + 4)/4⌉ = 2 and ⌈(2 + 3)/4⌉ = 2). FF3 and FF4 are added so that they provide their values to module 3 at the same time (input synchronization).

(8)–(9) are related to area minimization, as mentioned previously. Constraints (10)–(13) represent the relative positions among the modules. Constraint (14) specifies the allowed range of the half width of each module. (15) is a non-negativity constraint on the module locations. (16) states that (cij, dij) are binary variables. Finally, (17) specifies that the number of flip-flops must be an integer. Also note that M is a sufficiently large number.
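The flip-flop count required by Equation (2), combined with constraint (7), can be sketched directly. The assertion values follow the Figure 6 example (clock period 4, module delays 3 and 2, wire delays 4 and 3), with α and the Manhattan distance passed separately here for clarity.

```python
import math

def min_ffs(g_i, dist, alpha, L, f_ij=0):
    """Minimum FFs on wire (i, j).

    g_i:   delay of source module i
    dist:  Xij + Yij (Manhattan distance)
    alpha: wire delay per unit length
    L:     clock period constraint
    f_ij:  FFs already on the wire (constraint (7) forbids removing them)
    """
    z = math.ceil((g_i + alpha * dist) / L)
    return max(z, f_ij)
```

The ceiling captures the integrality constraint (17), and taking the maximum with f_ij enforces constraint (7).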

In the floorplanning, wi (half width) and hi (half height) are allowed to vary, but ai (area) must remain fixed, i.e., 4wi · hi = ai. This non-linear relation is linearized by letting hi = mi·wi + ki instead of hi = ai/(4wi) (replacing constraints (12) and (13) with (18) and (19)), where mi = −ai/(4wmin,i·wmax,i) and ki = ai(wmin,i + wmax,i)/(4wmin,i·wmax,i). An illustration is shown in Figure 7. Note that this approximation guarantees that a solution of the approximated model is feasible, because it is an over-approximation. It is possible to approximate hi better by using multiple linear segments instead of one. However, this more accurate model comes at the cost of more constraints in the MILP model. Since area is not the primary concern, we do not attempt this more accurate model during the optimization. However, a compaction algorithm is performed as a post-process, as discussed in Section 3.3.2, to fine-tune the overall floorplan area. The resulting MILP (Mixed Integer Linear Programming) formulation is shown in Figure 8.


Figure 7: Module height approximation for the MINP-to-MILP conversion, where mi = −ai/(4wmin,i·wmax,i) and ki = ai(wmin,i + wmax,i)/(4wmin,i·wmax,i).
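The over-approximation property can be checked numerically. With made-up values of ai, wmin,i, and wmax,i, the chord through the endpoints (wmin, ai/(4wmin)) and (wmax, ai/(4wmax)) should lie on or above the exact height curve hi = ai/(4wi) across the whole interval, since the exact curve is convex.

```python
def chord_coeffs(a, w_min, w_max):
    """Slope m and intercept k of the line h = m*w + k through the
    endpoints (w_min, a/(4*w_min)) and (w_max, a/(4*w_max))."""
    m = -a / (4.0 * w_min * w_max)
    k = a * (w_min + w_max) / (4.0 * w_min * w_max)
    return m, k

def is_over_approx(a, w_min, w_max, steps=100):
    """Verify m*w + k >= a/(4*w) on a grid over [w_min, w_max]."""
    m, k = chord_coeffs(a, w_min, w_max)
    for s in range(steps + 1):
        w = w_min + (w_max - w_min) * s / steps
        if m * w + k < a / (4.0 * w) - 1e-12:
            return False
    return True
```

Because the approximation never underestimates the height, any non-overlapping solution of the MILP remains non-overlapping under the exact heights, which is why feasibility is preserved.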

MILP Formulation

Minimize  Σ_{(i,j)∈E} [U1·λij·zij + U2·(Xij + Yij)] + U3·Xm

Subject to (2)–(11), (14)–(17) and the following:

yi + mi·wi + ki ≤ yj − mj·wj − kj + M(1 − cij + dij),  i < j  (18)
yi − mi·wi − ki ≥ yj + mj·wj + kj − M(2 − cij − dij),  i < j  (19)

Figure 8: MILP (Mixed Integer Linear Programming) formulation of the profile-guided microarchitectural floorplanning.

3.3.2 Linear Programming Model

Even with the MILP conversion, the floorplanning problem remains NP-hard; specifically, zij, cij, and dij are integer variables. To remedy this, the MILP is further relaxed into a Linear Programming (LP) model.3 A partitioning method similar to the one described in [68] is used. To relax the integrality while maintaining feasibility and staying close to the optimal solution, the integrality of zij is first relaxed so that it may take real values. Linear programming problems are then solved several times to determine the relative positions among the modules, i.e., cij and dij. Once cij and dij are known and zij can take real values, the MILP model shown in Figure 8 becomes an LP.

3If a near-optimal solution is required, MILP is a better approach than LP. However, MILP in general requires an excessive amount of computation to solve.

LP-based Slicing Floorplanning

Initialize B(1) = {1}, M1(1) = N;
for (u = 1 to log2 N)
    for (count = 1 to run)
        Choose a block j and divide it into two;
        Specify Sjk(u), (x̄jk, ȳjk);
        Solve LP(u);
        Update B(u + 1), Mj(u + 1), rj, lj, tj, bj;
    Project best solution for u + 1;
Obtain cij, dij from the prior slicing floorplan;
Solve MILP;
return xi, yi, wi, hi, zij;

Figure 9: Description of the LP-based slicing floorplanning algorithm. A top-down recursive bipartitioning is performed and LP-based floorplanning is solved at each iteration. The MILP is solved again after the last iteration using the slicing floorplanning result.

The LP-based slicing floorplanning algorithm consists of multiple iterations, where at each iteration a cutline is inserted to divide a region (alternatively called a block) into two sub-blocks. The algorithm starts by creating a single large block containing all modules. At each iteration, a block is chosen and divided into two sub-blocks, and module floorplanning is performed again so that the objective is further minimized. After the floorplanning, the modules in the chosen block should be enclosed by the block boundaries, and the area-weighted mean (= center of gravity) of all modules in each sub-block should coincide with the center location of the sub-block. In addition, the user-specified clock period constraint L must be satisfied, i.e., the longest combinational path delay should be less than L. The iterations terminate when each block contains exactly one module. Lastly, the relative positions among the modules are obtained from the slicing floorplanning result, and the MILP (shown in Figure 23) is solved again. This time, however, the MILP formulation becomes an LP since cij and dij are already determined and zij is still allowed to take non-integer values. Note that each iteration can be repeated multiple times to obtain different slicing floorplans, because multiple solutions satisfy the boundary and center of gravity constraints during each bipartitioning. Thus, each bipartitioning is performed several times, and the best solution in terms of total wirelength is picked for the next iteration.
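The control flow of the algorithm in Figure 9 can be sketched as follows. This is a minimal Python sketch, not the thesis implementation (which is written in C and solves an LP at every step): a greedy area-balancing split stands in for the LP solve, and the module names, areas, and chip rectangle are hypothetical.

```python
def bisect_block(rect, modules, areas):
    """Split one block into two sub-blocks of roughly equal module area.

    rect = (x0, y0, x1, y1). The greedy area balancing below stands in
    for the LP that the thesis solves at each iteration."""
    x0, y0, x1, y1 = rect
    left, right, a_l, a_r = [], [], 0.0, 0.0
    for m in sorted(modules, key=lambda m: -areas[m]):
        if a_l <= a_r:
            left.append(m); a_l += areas[m]
        else:
            right.append(m); a_r += areas[m]
    if (x1 - x0) >= (y1 - y0):        # vertical cutline bisects the block
        xm = (x0 + x1) / 2
        return (left, (x0, y0, xm, y1)), (right, (xm, y0, x1, y1))
    ym = (y0 + y1) / 2                 # horizontal cutline otherwise
    return (left, (x0, y0, x1, ym)), (right, (x0, ym, x1, y1))

def slicing_floorplan(areas, chip=(0.0, 0.0, 1.0, 1.0)):
    """Top-down recursive bipartitioning until each block holds one module."""
    blocks = [(list(areas), chip)]
    while any(len(ms) > 1 for ms, _ in blocks):
        nxt = []
        for ms, rect in blocks:
            if len(ms) <= 1:
                nxt.append((ms, rect))
            else:
                nxt.extend(bisect_block(rect, ms, areas))
        blocks = nxt
    # Place each module at the center of its final block.
    return {ms[0]: ((r[0] + r[2]) / 2, (r[1] + r[3]) / 2)
            for ms, r in blocks if ms}

pos = slicing_floorplan({'IF': 1.0, 'IS': 1.0, 'EX': 1.0, 'CACHE': 1.0})
```

With four equal-area modules each ends up alone in one quadrant of the unit chip, mirroring the termination condition of the thesis algorithm.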

LP Formulation

Minimize   Σ(i,j)∈E ( U1 · λij zij + U2 · (Xij + Yij) )

Subject to (2)–(7), (14)–(15) and the following:

xi + wi ≤ rj,   i ∈ Mj(u), j ∈ B(u)   (20)
xi − wi ≥ lj,   i ∈ Mj(u), j ∈ B(u)   (21)
yi + mi wi + ki ≤ tj,   i ∈ Mj(u), j ∈ B(u)   (22)
yi − mi wi − ki ≥ bj,   i ∈ Mj(u), j ∈ B(u)   (23)

And for k ∈ {1, 2}, j ∈ B(u):

Σi∈Sjk(u) ai xi = ( Σi∈Sjk(u) ai ) · x̄jk   (24)
Σi∈Sjk(u) ai yi = ( Σi∈Sjk(u) ai ) · ȳjk   (25)

Figure 10: LP (Linear Programming) formulation of the profile-guided microarchitectural floorplanning. This LP is used to perform floorplanning at iteration u of the main algorithm shown in Figure 9.

Figure 9 shows a description of the LP-based slicing floorplanning algorithm. B(u) denotes the set of all blocks at iteration u, and Mj(u) denotes the set of all modules currently in block j at iteration u. Sjk(u) is the set of modules assigned to the center of sub-block k (k ∈ {1, 2}) contained in block j at iteration u. Let the center of sub-block k contained in block j be denoted by (x̄jk, ȳjk). Note that balancing the area of the two sub-blocks at each bipartitioning helps reduce the overall floorplan area. Thus, each cutline bisects the initial block, and x̄jk and ȳjk correspond to the center locations of the two sub-blocks. Finally, let rj, lj, tj, bj denote the right, left, top, and bottom boundaries of block j. Figure 10 shows the LP formulation that is solved at each iteration of the slicing floorplanning.4 The block boundary constraints (20)–(23) require that all modules in the block be enclosed by

4Note that the area objective (= U3 · Xmax in Figure 23) is dropped from the LP formulation. Having three competing objectives may distract the optimization process, so we focus on performance and wirelength during the optimization and handle area with the compaction scheme in a post-process. This approach is observed to produce better results than optimizing all three objectives simultaneously.

[Figure 11 panels (a)–(e): a five-module example (IF, IS, ALU, EX, CACHE) with edge weights 1.00, 0.90, and 0.10, shown through successive bipartitioning steps.]

Figure 11: Illustration of recursive bipartitioning for the relaxation of integer constraint. Recursive bipartitioning is performed until each module is placed in one block. The objective is to minimize the weighted wirelength under the boundary, center of mass, and clock period constraints. After the recursive bipartitioning is done, relative positions among the modules (left, right, above, and below) are fed to the integer programming for the final iteration.

these block boundaries. The center of gravity constraints (24)–(25) require that the area-

weighted mean (= center of gravity) among all modules in each sub-block corresponds to

the center of the sub-block. Figure 11 shows an example of the LP-based floorplanning

algorithm.
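As a toy instance of the boundary and center-of-gravity constraints, the following sketch solves a one-dimensional analogue of the LP in Figure 10 with scipy.optimize.linprog. A block [0, 4] is bisected at x = 2; sub-block 1 (center 1) holds hypothetical modules a (area 2) and b (area 1), sub-block 2 (center 3) holds module c (area 3), and the linearized wirelength of edges (a, c) and (b, c) is minimized. All areas, half-widths, and edges are invented for illustration; this is not an instance from the thesis.

```python
from scipy.optimize import linprog

# Variables: [x_a, x_b, x_c, X_ac, X_bc], where X_ij linearizes |x_i - x_j|.
c = [0, 0, 0, 1, 1]                 # minimize wirelength X_ac + X_bc

A_ub = [[ 1, 0, -1, -1,  0],        # x_a - x_c <= X_ac
        [-1, 0,  1, -1,  0],        # x_c - x_a <= X_ac
        [ 0, 1, -1,  0, -1],        # x_b - x_c <= X_bc
        [ 0,-1,  1,  0, -1]]        # x_c - x_b <= X_bc
b_ub = [0, 0, 0, 0]

A_eq = [[2, 1, 0, 0, 0],            # center of gravity: 2*x_a + 1*x_b = 3*1
        [0, 0, 1, 0, 0]]            # center of gravity: 3*x_c = 3*3, i.e. x_c = 3
b_eq = [3, 3]

# Block boundary: modules (half-widths 0.5, 0.25, 0.5) stay inside [0, 4].
bounds = [(0.5, 3.5), (0.25, 3.75), (0.5, 3.5), (0, None), (0, None)]

res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
print(res.x[:3], res.fun)
```

The center-of-gravity equality for the single-module sub-block pins x_c at 3, while the two-module sub-block trades off the positions of a and b along 2x_a + x_b = 3 to shorten the wires to c.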

The main objective of the slicing floorplanning is to determine cij and dij, i.e., the relative positions among the modules. Note that the recursive bipartitioning scheme may produce multiple relative positions between a pair of modules. In Figure 12, module a can be either on the left of module b, i.e., (cab, dab) = (0, 0), or above module b, i.e., (cab, dab) = (1, 1). However, one has to choose between (0, 0) and (1, 1) to be used in the MILP formulation (see Figure 23). Note that this decision has a non-trivial impact on the overall floorplan area and thus must be made carefully. In the area compaction heuristic, the decision is based on the first cut that separates a pair of modules. Modules a and b are first separated by the vertical cut (cut n−1) in Figure 12 and then by the horizontal cut (cut n). In this case, (cab, dab) = (0, 0) is chosen, i.e., a is on the left of b, since it is the vertical cut that first separates a and b during the top-down recursive bipartitioning process. The related experiment indicates that this scheme generates highly compact floorplans.
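The first-separating-cut rule can be sketched as a walk over the slicing tree: for any pair of modules, the topmost node at which they fall into different subtrees is, by construction, the first cut that separates them. This hypothetical Python sketch uses an assumed tree encoding and plain relation labels rather than the thesis's (cij, dij) bit encoding.

```python
def leaves(t):
    """Collect module names below a slicing-tree node.
    A node is a module name (leaf) or a tuple ('V'|'H', left, right)."""
    return [t] if isinstance(t, str) else leaves(t[1]) + leaves(t[2])

def relative_positions(t, out=None):
    """Record, for each module pair, the relation implied by the FIRST
    (topmost) cut that separates them: 'V' puts the left child to the
    left, 'H' puts the left child below (an assumed convention)."""
    if out is None:
        out = {}
    if isinstance(t, str):
        return out
    cut, l, r = t
    rel = 'left of' if cut == 'V' else 'below'
    for i in leaves(l):
        for j in leaves(r):
            out[(i, j)] = rel      # topmost cut separating i and j
    relative_positions(l, out)
    relative_positions(r, out)
    return out

rel = relative_positions(('V', 'a', ('H', 'b', 'c')))
```

For the tree above, a is left of both b and c (the vertical root cut separates them first), while b is below c (their first separating cut is the deeper horizontal one).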

[Figure 12 diagram: modules a, b, c, d; the vertical cut n−1 separates a and b before the horizontal cut n does.]

Figure 12: Illustration of the area compaction scheme.

Table 1: Microarchitecture configurations used in the study.

µ-arch    cfg1   cfg2   cfg3   cfg4   cfg5   cfg6   cfg7   # of bits
M-width      2      4      4      8      8      8      8       -
Bpred      128    512    512    512    512    512    512       2
BTB        128    512    512    512    512    512    512      96
RUU         64    128    256    256    256    512    512     168
L1 I-$      8K    64K    64K    64K    64K     8K     8K     512
L1 D-$      8K    64K    64K    64K    64K     8K     8K     512
L2 U-$     64K   512K   512K   512K   512K   128K   128K    1024
L3 U-$       -      -      -      -      -     2M     2M    1024
ITLB        32    128    128    128    128    128    128     112
DTLB        32    128    128    128    128    128    128     112
ALU          2      4      4      4      6      8      8       -
FPU          1      2      2      2      3      4      4       -
LSQ         16     64    128    128    128    128    128      84
M-port       1      4      4      4      6      8      8       -

3.4 Simulation Infrastructure

The Simplescalar 3.0 tool suite [8] was used as the architecture simulator for both communication profile collection and performance simulation. The detailed microarchitecture is illustrated in Figure 13, and the seven microarchitecture configurations used in the experiments are listed in Table 1.5 Each functional block in Figure 13 represents a module used by the floorplanner.

In order to facilitate the physical-design-driven microarchitectural exploration, a few new

features were introduced on top of the baseline machine model. First, the pipeline depth was

made configurable. The instruction fetch unit of the simulator was modified to accommodate

5We believe that architects will use more and more on-chip L3 caches [128] in future designs because of the ever-increasing number of transistors available and the ever-increasing disparity between memory access latency and on-chip cache access latency. The cfg6 and cfg7 configurations are designed to study the impact of an on-chip L3 cache.

any desired pipeline depth, providing the capability to study the effects of lengthening the processor pipeline. In terms of performance, the pipeline depth has a direct impact on the operational frequency of the processor. For example, depending on the placement of each

functional block by the floorplanner, the signal routing might take more than one pipeline

stage to complete. The number of pipeline stages is also affected by the extension of the

functional units. In this study, the pipeline depth is varied from 15 to 22 stages. In the

modified simulator, the functional units or execution resources are completely configurable

in terms of operation and issue latency and can be specified as configuration input. For

instance, the same functional unit design (e.g., integer ALU) can have non-uniform latencies

depending on each individual unit’s final placement. In addition, a third level cache was

added. The primary microarchitectural parameters are elaborated as follows.

• Machine Width: The maximum number of issuable instructions per cycle.

• The Branch Predictor (Bpred): In this study, a hybrid branch predictor containing

a two-bit counter based predictor and a Gshare predictor with a choice meta predictor [79] is used. Each Bpred entry count in Table 1 is the number of entries (2^N)

of the two-bit counter array and the meta table, where N is the size of the global

branch history.

• The Branch Target Buffer (BTB): Indexed by the program counter, each BTB entry

keeps the branch instruction address and its associated branch target address for

accelerating instruction fetching.

• The Register Update Unit [106] (RUU): It combines the functionality of a register alias

table, a reservation station and a reorder buffer for maintaining instruction ordering

and supporting precise interrupt. Each RUU entry also contains a physical register

for enabling register renaming.

• Caches: The first level cache contains a split instruction and data cache while the

second and the third level are both unified caches.

[Figure 13 diagram: pipeline stages fetch, dispatch, issue, wb, and commit connecting modules bpred, btb, fetch q, i1 cache, i-tlb, reg file, ruu, fp issue, i-alu, fpu, lsq, d-tlb, d1 cache, biu, memctr, l2 cache, and l3 cache.]

Figure 13: Processor microarchitecture model.

• The Translation Lookaside Buffer for fast address translation (TLB): Split instruction

and data TLBs are assumed.

• ALU and FPU: The ALU only executes integer operations, while the FPU performs floating-point arithmetic.

• LSQ: The Load/Store Queue for enabling load speculation and disambiguating mem-

ory addresses.

• Mem port: The number of ports available in the first level data cache.

Some applications exploit particular microarchitecture components (e.g., caches, TLBs, or the BTB) better than others, but there is no guarantee that increasing the sizes of these modules will lead to an overall execution time improvement measured in billions of instructions per second (BIPS). Note that the IPC is an architectural metric, while the clock period is determined largely by both the degree of pipelining and the targeted process technology. Increasing the sizes of BTBs and TLBs may improve IPC; nevertheless, it could also lengthen the clock period because of the elongated wires of larger structures, thus leading to an overall increase in the total execution time. In order to have a simultaneous view of the frequency and the IPC for attaining optimal performance, we must explore the floorplan and the processor's microarchitecture configuration together.

For accurate performance prediction and optimization, wires can no longer be isolated from architecture-level evaluation but must be modeled as units that consume power and

have delays. Therefore, provisions were also made to consider wire delays in the simulator.

The existing simulator assumes that the communication latency between functional blocks is always one cycle, which no longer holds when operating at extremely high frequency, given the increased wire delays and ever-growing die areas. For example, the Pentium 4 processor design dedicated two pipeline stages to moving signals across the chip because of wire delay [50]. The wire parameters can be captured for architectural optimization through floorplanning to a reasonable degree of accuracy.

As aforementioned, in the modified simulator the inter-communication latencies between functional units, or data forwarding latencies among different pipeline stages, were made configurable, enabling a more realistic IPC projection. For performance evaluation, the information provided by the floorplanner is used to derive essential simulation parameters

such as pipeline depth and communication/forwarding latencies. The inter functional unit

latency is a function of the distance between units in the floorplan and the number of flip-

flops between modules. If the floorplan has been optimized for clock speed, the pipeline

depth of the processor reflects it. In the experiments, an improvement in performance (in

architectural simulation) is expected if the frequencies of forwarding traffic between units

are included in the floorplan formulation. The forwarding frequency-driven floorplanning

tries to place heavily communicating units closer together, minimizing their latencies as a function of the distance.
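A plausible latency model implied by this description maps floorplan distance to an integral cycle count. The formula below is an assumption for illustration (the thesis does not state the exact expression): a repeated-wire delay per mm multiplied by the distance, quantized to the clock period, with a minimum of one cycle.

```python
from math import ceil

def comm_latency_cycles(distance_mm, wire_delay_ns_per_mm, clock_period_ns):
    """Hypothetical inter-module latency model: the wire delay over the
    floorplan distance, rounded up to whole clock cycles (at least 1)."""
    delay_ns = distance_mm * wire_delay_ns_per_mm
    return max(1, ceil(delay_ns / clock_period_ns))

# e.g. a 10 mm wire at 0.25 ns/mm under a 0.5 ns clock needs 5 pipeline cycles
```

Under this model, a floorplan that halves the distance between the RUU and an ALU directly halves the forwarding latency charged in the architectural simulation, which is why the profile-guided objective rewards pulling heavily communicating units together.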

Table 2 illustrates the range of inter-module access latency values measured in clock cycles. These values are averaged over the wire latency values obtained from multiple profile-guided floorplanning solutions. Let li denote the total number of possible latency values wire i can have. For example, l1 for the first communication wire (= branch predictor penalty) from Table 2 is 7. Then, the size of the solution space for module floorplanning is ∏(i=1..M) li for M wires. Therefore, a brute-force search for an optimal solution proves impractical because of the exponential size of the solution space.
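The size of this solution space can be computed directly from the ranges in Table 2; each wire contributes (max − min + 1) possible latency values.

```python
from math import prod

# (min, max) access latencies in clock cycles, from Table 2
ranges = [(5, 11), (9, 33), (1, 8), (4, 20), (1, 7), (4, 20),
          (2, 11), (2, 15), (2, 11), (2, 11), (5, 20), (16, 79)]

space = prod(hi - lo + 1 for lo, hi in ranges)
print(space)  # on the order of 4 x 10^13 latency combinations
```

Even for only these twelve wires, the product exceeds 10^13, which is why the thesis resorts to LP relaxation rather than exhaustive search.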

Table 2: Access latency range (in clock cycles) among the microarchitecture modules.

communication              range
Branch predictor penalty   5 - 11
Pipeline depth             9 - 33
Inter-ALU communication    1 - 8
Inter-FPU communication    4 - 20
RUU access from ALU        1 - 7
RUU access from FPU        4 - 20
ALU access from RUU        2 - 11
FPU access from RUU        2 - 15
IL1 access                 2 - 11
DL1 access                 2 - 11
L2 access                  5 - 20
L3 access                  16 - 79

3.5 Experimental Results

The floorplanning algorithms are implemented in C, compiled with gcc -O3, integrated with the Simplescalar tool set, and executed on Pentium IV 2.4 GHz machines. A mathematical solver named lp_solve [89] is used to solve the linear programs. Experiments are performed on 10 SPEC2000 benchmarks (gzip, vpr, mcf, gap, bzip2, twolf, swim, art, equake, lucas), 6 from the integer suite and 4 from the floating-point suite. The training input set was used for profile collection, while the IPC performance results were gathered using the reference input set. Each simulation was fast-forwarded by 200 million instructions and then simulated for 100 million instructions. The maximum total time spent in both floorplanning and simulation was less than one hour for each benchmark. The following algorithms are used in the experiment:

1. WL: wirelength-driven floorplanning. This is obtained by setting U1 = 0 and U2 > 0 in Figure 10. Area compaction is not used.

2. AWL: area/wirelength-driven floorplanning. Area compaction is used as a post-process.

3. PGFi: pure profile-guided floorplanning. This is obtained by setting U1 > 0 and U2 = 0 in Figure 10. Area compaction is not used.

4. PGF: profile-guided floorplanning with area/wirelength optimization. This is obtained by setting U1 > 0 and U2 > 0 in Figure 10. Area compaction is used.

The average IPC, BIPS, wirelength, and area results over all 10 benchmark applications are reported for each of the 7 microarchitectural configurations listed in Table 1.

Figure 14 shows the IPC comparison between the WL and PGFi floorplanning algorithms. It is observed that PGFi consistently obtains higher IPC values than WL for all 7 configurations, with improvements ranging from 11% to 39%. In addition, the IPC improvement is more visible for the more complex designs (cfg6 and cfg7). This is strong evidence that exploiting the dynamic inter-module interconnect traffic information during floorplanning is crucial in improving the performance of microarchitecture designs as measured by IPC. The main reason behind this success is the shorter access latency among heavily communicating functional

modules. Figure 15 shows the IPC comparison between the AWL and PGF floorplanning algorithms, where area compaction is used for both. Note that the IPC advantage of PGF over AWL is reduced to 7% to 19%, compared with PGFi vs. WL. This is because of the tradeoff between performance and area, where PGF lost more performance than AWL did from the additional area/wirelength objectives. In fact, PGF obtained 40%

better wirelength and 50% better area results compared to PGFi as revealed in the following

discussions.

Figure 16 shows wirelength comparison among WL, AWL, and PGF floorplanners.

These results are normalized to the PGFi algorithm. The PGF results show that the wirelength is reduced by almost 40% on average compared to PGFi. This is primarily

because of the wirelength objective used in PGF. In addition, area compaction had a posi-

tive impact on reducing the wirelength further. It is interesting to note that AWL obtained

approximately 10% better wirelength results than WL, suggesting a positive correlation be-

tween area and wirelength in microarchitectural floorplanning. Compared with AWL, the

PGF obtained wirelength results within a tolerable range: 10% to 30% worse on average.

Figure 17 shows the area comparison among the WL, AWL, and PGF floorplanners. These results are normalized to the PGFi algorithm. The PGF results show that the area is reduced by almost 50% on average compared to PGFi. This is primarily because of

[Figure 14 chart: IPC ratio (before compaction), y-axis 0.9 to 1.5, comparing WL and PGFi across cfg1 to cfg7.]

Figure 14: IPC comparison between WL (wirelength-driven) and PGFi (profile-guided) floorplanning algorithms. Area compaction is not used.

the area compaction algorithm. In addition, the wirelength objective had a positive impact on reducing the area further. The area compaction scheme worked well with AWL and generated consistent area improvement. From the AWL vs. PGF comparison, it is noted that PGF obtained better area results for the small configurations (cfg1 to cfg3) while losing on the big configurations (cfg4 to cfg7). However, the gap between AWL and PGF for the big configurations decreases as the design size increases. Note that the module sizes are better balanced in the bigger configurations, which helps ease the burden of the floorplanner for area optimization. This leads us to believe that partitioning bigger modules into smaller sub-modules may help area optimization.

Figure 18 shows the BIPS comparison among AWL, PGF, and OPT for several future technologies. These results are normalized to the WL algorithm for the 5 GHz design. In the OPT algorithm, the delay of all wires is set to zero to obtain an upper bound on BIPS. OPT is not

based on any floorplanning, since the wire delay is completely ignored. The overall trend indicates that the gap between AWL and PGF (the BIPS advantage of PGF) increases as the clock frequency increases. This clearly indicates that the profile information of the target application must be considered to improve the overall throughput of the system.

Moving to higher clock frequencies, the gap between PGF and OPT also increases. This shows that there is still room for improvement, either by a better floorplanning algorithm or by additional circuit techniques that alleviate the ever-worsening wire problem.

[Figure 15 chart: IPC ratio (after compaction), y-axis 0.9 to 1.5, comparing AWL and PGF across cfg1 to cfg7.]

Figure 15: IPC comparison between AWL (area/wirelength-driven) and PGF (profile/area/wirelength optimization) floorplanning algorithms. Area compaction is used.

Table 3: Runtime breakdown in seconds

step             runtime
size estimation      2.9
profiling         1133.1
floorplanning      212.6
simulation         998.6

Table 3 shows a breakdown of the average runtime among the various steps involved in the framework. The “size estimation” is the runtime of CACTI and GENESYS combined. The

“profiling” is the runtime for collecting access frequency information using a cycle-accurate

simulation. The “floorplanning” is the runtime for the LP-based floorplanning algorithm.

Finally, the “simulation” is the runtime for computing the final IPC values from another

cycle-accurate simulation. All runtime results are reported in seconds. A snapshot of final

PGF floorplan is shown in Figure 19.

3.6 Summary

An effective microarchitectural floorplanning algorithm can no longer ignore the dynamic

communication patterns of applications and the impact of wire delay on the overall performance. In this chapter, we propose a profile-guided microarchitectural floorplanner that considers both the impact of wire delay and architectural behavior to reduce the latency of frequent routes inside a processor and to maintain performance scalability. Results show that the profile-guided method yields a large improvement in IPC and BIPS under the clock period constraint compared to the conventional area/wirelength-based floorplanner.

[Figure 16 chart: wirelength ratio (normalized to PGFi), y-axis 0 to 1, comparing WL, AWL, and PGF across cfg1 to cfg7.]

Figure 16: Wirelength comparison among WL (wirelength-driven), AWL (area/wirelength-driven), and PGF (profile/area/wirelength-driven) floorplanners. These results are normalized to the PGFi (profile-only) algorithm.

[Figure 17 chart: area ratio (normalized to PGFi), comparing WL, AWL, and PGF across cfg1 to cfg7.]

Figure 17: Area comparison among WL (wirelength-driven), AWL (area/wirelength-driven), and PGF (profile/area/wirelength-driven) floorplanners. These results are normalized to the PGFi (profile-only) algorithm.

[Figure 18 chart: BIPS ratio (normalized to WL at 5 GHz), comparing AWL, PGF, and OPT across clock frequencies from 5 GHz to 20 GHz.]

Figure 18: BIPS comparison among AWL (area/wirelength-driven), PGF (profile/area/wirelength-driven), and OPT (upper bound with zero wire delay) for several future technologies. These results are normalized to the WL (wirelength-only) algorithm for the 5 GHz design.

[Figure 19 snapshot: final PGF floorplan placing modules including dl2, commit, bpred, itlb, wb, fetchq, il1, btb, if, frf, dcq, lsq, dl1, issue, dtlb, ruu, irf, alu1, alu2, fpu1, biu, mem, and fpissue.]

Figure 19: A snapshot of the final PGF floorplan.

CHAPTER IV

3D MICROARCHITECTURAL FLOORPLANNING

One technique that can alleviate the IPC (Instructions Per Cycle) degradation resulting from wire delay is the microarchitecture-aware floorplanning described in the last chapter. Floorplanners that consider the impact of wire delay by moving heavily communicating modules closer together can shorten the latency on such paths and thereby improve performance. Another technique is to move to three-dimensional integrated circuits, or 3D ICs. By moving to 3D ICs, the total wirelength can be reduced and the clock speed can be increased, as shown in [134]. One bottleneck to the adoption of 3D ICs is heat dissipation. The structure of 3D ICs inherently implies that moving heat away from the center of the chip is more difficult. This can result in more complex cooling devices, circuit malfunctions, and shorter circuit lifetimes. When designing ICs with many layers of transistors stacked together, thermal issues become a large concern. In this chapter, we propose a floorplanning algorithm that considers performance, area, and thermal issues using a mathematical programming approach that utilizes information gathered from cycle-accurate simulation.

The structure of this chapter is as follows: Section 4.1 presents the problem formulation. Section 4.2 details the 3D thermal analysis technique. Section 4.3 describes the infrastructure for cycle-accurate simulation. Section 4.4 presents the floorplanning algorithm. Section 4.5 shows the experimental results. Finally, the chapter is concluded in Section 4.6.

4.1 Problem Formulation

4.1.1 Design Flow

An overview of the profile-driven microarchitectural floorplanning is shown in Figure 20.

The framework combines technology scaling parameters and the execution profiling infor- mation of applications to guide the floorplanning step of a given microarchitecture design.

First, a machine description is provided as input to the microarchitecture simulator, where

[Figure 20 diagram: technology parameters, the architectural description, and the application feed CACTI, GENESYS, and the profiler; the energy-related data and the module-level netlist plus profile data pass through 3D partitioning; the 3D architectural floorplanner, the 3D power and thermal analyzer, and the target frequency interact to produce 3D floorplans, temperatures, and a layout with latencies; the 3D architectural simulator then reports the thermal distribution and IPC.]

Figure 20: Overview of the thermal-aware 3D microarchitectural floorplanning framework.

profiling counters were instrumented to record module-to-module communication.

Then, a cycle-accurate simulation is performed using Simplescalar [8] to collect and extract the amount of interconnection traffic between modules for a given benchmark program. The microarchitecture simulator was integrated with Wattch [15] to provide the power numbers

that are used to drive the 3D-thermal analyzer. For cache-like or buffer-like structures, the area and module delay are estimated using an industry tool from HP Western Research Labs called CACTI [101]. For scaling other structures such as ALUs, GENESYS [41] developed at the Georgia Institute of Technology is used.

After the timing, area, and access frequency information of each module is collected, the module-level netlist, statistical interconnection traffic, and a target processor frequency are fed into the thermal/profile-guided floorplanner. The power consumption of all the functional units are fed to the 3D-thermal analyzer to generate the thermal profile. The

3D-floorplanner takes in the netlist and the temperature information to generate a floorplan that maximizes the performance under the thermal and frequency constraints. The new

floorplan is fed back to the 3D-thermal analyzer, along with the power numbers to generate a new thermal profile. With these new latency values, architecture performance simulation

is performed to obtain more realistic and accurate IPC and BIPS numbers. A few iterations take place before an optimal floorplan for the given constraints is achieved.

4.1.2 Problem Formulation

Given a set of microarchitectural modules and a netlist that specifies the connectivity among

these modules, the thermal- and profile-driven microarchitectural floorplanner tries to place

each module such that (i) there is no overlap among the modules, and (ii) a user-specified

clock period constraint is satisfied. The objective is to minimize the maximum temperature

among all blocks and the overall execution time of a given processor. Because the clock frequency is fixed, IPC (Instructions Per Cycle) is used as the performance measurement. IPC represents the average number of instructions that can be issued in one clock cycle. In VLSI circuit design, the clock period is used to evaluate the quality of logic synthesis and physical design solutions; it is equivalent to the longest path delay. The clock frequency F, which denotes the total number of cycles per second, is the reciprocal of the clock period. Finally, BIPS = IPC × F, where F is in gigahertz. In this chapter, the objective is to maximize IPC under a clock frequency constraint so that the overall BIPS is maximized.
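The BIPS relation above can be illustrated with a short worked computation; the IPC and clock-period values below are arbitrary examples, not results from the thesis.

```python
def bips(ipc, clock_period_ns):
    """BIPS = IPC x F, where F = 1 / clock period.
    With the period in nanoseconds, F comes out in GHz and the
    product is billions of instructions per second."""
    freq_ghz = 1.0 / clock_period_ns
    return ipc * freq_ghz

# e.g. an IPC of 0.8 at a 0.5 ns clock period (2 GHz) gives 1.6 BIPS
```

This makes the tradeoff in the text concrete: a floorplan change that raises IPC but forces a longer clock period can still lower BIPS, so both factors must be optimized together.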

4.2 3D Thermal Analysis

4.2.1 3D Power Analysis

While collecting the interconnection traffic between modules, the power consumption of all the functional blocks was gathered cumulatively and stored every hundred thousand cycles. This traffic activity and dynamic power consumption are collected only once during the whole experiment. These power numbers are fed to the 3D thermal analyzer, which produces the thermal profile. The 3D floorplanner then gives a new floorplan for the current

thermal profile and module netlist based on the constraints. The new floorplan may have a

different interconnect length between the modules. Therefore, interconnect power is calcu-

lated again based on the new lengths and added to the dynamic power consumption that

was collected earlier. Wattch currently does not model global interconnect power. Hence, it

was modeled as a separate module and is added to the dynamic power consumption. Here,

it is safely assumed that the dynamic power consumption still remains the same even with different floorplans, as the activity factor does not change with the position of the modules; the activity factor depends on the program behavior rather than on the position of the modules.

Figure 21: 3D grid of a chip for thermal modeling.

4.2.2 3D Thermal Analysis

For thermal analysis, a 3D resistor mesh [117, 26] is used, as shown in Figure 21. The floorplanning algorithm does not require precise temperature values to successfully minimize the chip temperature profile. Therefore, a simplified thermal model is used during the floorplanning process to speed up the thermal calculation; because the thermal model is called many times during floorplanning, this can dramatically improve runtime. The temperature numbers acquired from this model are accurate relative to one another, and this relative accuracy is all that is required for the optimization process. The model uses a non-uniform 3D thermal resistor mesh in which grid lines are defined at the center of each architectural block being considered. These grid lines are defined in the X and Y directions and extend through the Z direction to form planes. The intersections of grid lines in the X and Y directions define the thermal nodes of the resistor mesh. Each thermal node models a rectangular prism of silicon that may dissipate power if it covers some portion of a block. The total power of each block is distributed among the nodes it covers, in proportion to the x-y area of the block that each node covers. After the floorplanning is done, a slower but more accurate finely spaced uniform grid is used to report the final temperature profile.
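The grid construction and power distribution described above can be sketched as follows. This is a hypothetical 2D (x-y) sketch of the mesh setup only, not the thesis's thermal solver: grid lines sit at block centers, each node owns the region between midpoints of neighboring grid lines, and block power is split among nodes in proportion to overlap area.

```python
def grid_lines(blocks):
    """Non-uniform grid lines at the x and y centers of the blocks."""
    xs = sorted({(b['x0'] + b['x1']) / 2 for b in blocks.values()})
    ys = sorted({(b['y0'] + b['y1']) / 2 for b in blocks.values()})
    return xs, ys

def node_cells(xs, ys, chip_w, chip_h):
    """Each node owns the span between midpoints of neighboring grid lines."""
    def bounds(lines, size):
        edges = [0.0] + [(a + b) / 2 for a, b in zip(lines, lines[1:])] + [size]
        return list(zip(edges, edges[1:]))
    return bounds(xs, chip_w), bounds(ys, chip_h)

def distribute_power(blocks, chip_w, chip_h):
    """Split each block's power among the nodes covering it,
    proportionally to the overlap area (total power is conserved)."""
    xs, ys = grid_lines(blocks)
    xcells, ycells = node_cells(xs, ys, chip_w, chip_h)
    power = {}
    for b in blocks.values():
        area = (b['x1'] - b['x0']) * (b['y1'] - b['y0'])
        for i, (cx0, cx1) in enumerate(xcells):
            for j, (cy0, cy1) in enumerate(ycells):
                ov = (max(0.0, min(b['x1'], cx1) - max(b['x0'], cx0)) *
                      max(0.0, min(b['y1'], cy1) - max(b['y0'], cy0)))
                if ov > 0:
                    power[(i, j)] = power.get((i, j), 0.0) + b['p'] * ov / area
    return power
```

For two side-by-side blocks filling a 2 x 1 chip, each node receives exactly its own block's power, and the sum over all nodes equals the total chip power.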

[Figure 22 diagram: pipeline stages fetch, dispatch, issue, wb, and commit connecting modules bpred, btb, fetch q, i1 cache, i-tlb, reg file, ruu, fp issue, i-alu, fpu, lsq, d-tlb, d1 cache, biu, memctr, l2 cache, and l3 cache.]

Figure 22: Processor microarchitecture model.

4.3 Simulation Infrastructure

The detailed microarchitecture used in the experiment is illustrated in Figure 22. Each

functional block represents a module used by the floorplanner. For accurate performance prediction and optimization, wires can no longer be isolated from architecture-level evaluation but must be modeled as units that consume power and have delays. Therefore,

provisions were made to consider wire delays in the simulator. The existing simulator as-

sumes that the communication latency between functional blocks is always one cycle; this

no longer holds while operating at extremely high frequency given the increased wire delays

and ever-growing die areas. For performance evaluation, the information provided by the

floorplanner is used to derive essential simulation parameters such as pipeline depth and

communication/forwarding latencies. The inter module latency is a function of the distance

and number of flip flops between modules. If the floorplan has been optimized for clock

speed, the pipeline depth of the processor reflects it. In the experiments, an improvement in

performance (from architectural simulation) is expected if the frequencies of forwarding traffic between units are included in the floorplan formulation. The profile-driven floorplanning

tries to place highly communicating modules closer together, minimizing their latencies as

a function of distance.

The approach used here is general enough to take in many different configurations. For the sake of expediency, one configuration was chosen for experimentation and is enumerated here: the machine width is 8, with 512-entry branch predictor, branch target buffer, and reorder buffer structures; 8 KB level-1 instruction and data caches; a 128 KB level-2 unified cache; a 2 MB level-3 unified cache; 128-entry instruction and data translation lookaside buffers; 8 ALUs; 4 FPUs; and a 128-entry load/store queue.

4.4 Thermal-aware 3D Partitioning and Floorplanning

4.4.1 3D Partitioning

Partitioning is performed using a modified FM algorithm [46]. Because of the imbalance between module sizes, some modules, such as the L3 cache, have to be removed before calling the FM algorithm. To improve the IPC of the system, modules that communicate heavily must be placed in different layers so that their access time can be reduced. To accomplish this, pseudo edges are first created between all modules. If two modules have no connection to each other, the weight of that pseudo edge is set to one. If two modules carry normalized traffic λ (the maximum value is one), the weight of the pseudo edge is set to 1 − λ. Then, the FM algorithm is called to balance the partitions while minimizing this pseudo weight.
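The pseudo-edge construction described above can be sketched directly; module names and traffic values below are hypothetical, and only the weight assignment (not the FM pass itself) is shown.

```python
from itertools import combinations

def pseudo_edge_weights(modules, traffic):
    """traffic[(i, j)] is the normalized access count (0..1) between i and j.
    Unconnected pairs get weight 1; connected pairs get 1 - lambda, so a
    min-cut partitioner finds heavily communicating pairs cheap to cut,
    tending to place them on different layers (as described above)."""
    w = {}
    for i, j in combinations(sorted(modules), 2):
        lam = traffic.get((i, j), traffic.get((j, i), 0.0))
        w[(i, j)] = 1.0 if lam == 0.0 else 1.0 - lam
    return w

w = pseudo_edge_weights({'alu', 'ruu', 'l3'}, {('alu', 'ruu'): 0.9})
```

Here the heavily communicating alu-ruu pair gets weight 0.1 while unconnected pairs keep weight 1.0, so cutting alu from ruu costs the partitioner least.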

4.4.2 MILP (Mixed Integer Linear Program)-based 3D Floorplanning

Figure 23 shows the MILP formulation of the thermal-aware 3D microarchitectural floor-

planner. The variables used in this formulation are enumerated below. The objective of the

ILP formulation (= Equation (26) is a weighted sum of performance-, thermal-, and area-

related terms. The weighted delay of an edge (i, j) is defined to be λij ·zij, where the weight

λij is based on module access frequency. zij is the number of FFs on the wire. The second

term in the objective is thermal-related. The algorithm tries to separate two hot blocks as

much as possible by minimizing (1 − Tij)(Xij + Yij). The final term minimizes area, where

Xmax is the maximum among all x values. Since minimizing Xmax·Ymax is non-linear, the algorithm minimizes only Xmax; letting A be the aspect ratio of the chip, A·Xmax bounds all y values. U1, U2, and U3 are user-defined parameters weighting the performance, thermal, and area objectives, respectively.

Let N denote the set of all flexible modules, and E denote the set of directed edges,

where a directed edge (i, j) represents a wire from module i to module j. Let H be the

height (maximum number of layers) of the 3D IC. Let Tij denote the normalized product

of the temperatures of modules i and j. Let α be the repeated wire delay per mm,¹ β the via delay,² and λij the statistical traffic on wire (i, j), i.e., the normalized access count from

module i to module j. gi is the delay of module i. wmin,i and wmax,i denote the minimum

and maximum half width of module i, respectively. The area of module i is denoted by ai.

Finally, fij is the number of flip-flops on wire (i, j) in the given microarchitectural design.

Let C (= 1/F ) denote the target cycle period of the given microarchitectural design,

which is an input to the floorplanner. In the MILP model, it is required to determine the

values for the following decision variables: xi, yi, li, wi, hi, and zij. Let (xi, yi) denote the location of the center of module i in the nonnegative quadrant. li is the level of module i, where there are

from 1 to H levels. Xij, Yij, and Lij represent |xi − xj|, |yi − yj|, and |li − lj| between

modules i and j, respectively. Xmax is the maximum value among all xi. A is the aspect

ratio of the chip. zij is the number of flip-flops on wire (i, j) after FF insertion. wi and hi

denote the half width and the half height of module i, respectively.

Constraint (27) is obtained from the definition of latency. If there is no FF on a wire

(i, j), the delay of this wire is calculated as d(i, j) = α(Xij + Yij) + βLij. Then, gi + d(i, j)

represents the latency of module i accessing module j. Since C denotes the clock period

constraint, (gi + d(i, j))/C denotes the minimum number of FFs required on (i, j) in order

to satisfy C. Absolute values of the x, y, and l distances are given in (28)–(30). To minimize the total area, (31) constrains the maximum of all x and y locations, assuming that the aspect ratio of the chip is A. Constraint (32) ensures that FF insertion does not remove any

existing FFs from the wires. Constraints (33)–(36) represent the relative positions among the modules and guarantee that no two modules overlap. (37) calculates eij: if eij is one, the two modules are in different levels and may overlap; otherwise they are in the same level and must not overlap. Constraint (38) specifies the

¹ Based on the predicted values of resistance, capacitance, and other parasitic parameters from [56], repeated wire delay is approximated to be 80 ps/mm for 30 nm technology. Note that an FO4 gate delay at 30 nm is approximately 17 ps.
² The via length and β are small; hence via delay is negligible.

MILP Formulation

Minimize  Σ_{(i,j)∈E} [ U1·λij·zij + U2·(1 − Tij)(Xij + Yij) ] + U3·Xmax    (26)

Subject to:

zij ≥ (gi + α(Xij + Yij) + β·Lij) / C,   (i, j) ∈ E    (27)

Xij ≥ xi − xj and Xij ≥ xj − xi,   (i, j) ∈ E    (28)

Yij ≥ yi − yj and Yij ≥ yj − yi,   (i, j) ∈ E    (29)

Lij ≥ li − lj and Lij ≥ lj − li,   (i, j) ∈ E    (30)

Xmax ≥ xi and A·Xmax ≥ yi,   i ∈ N    (31)

zij ≥ fij,   (i, j) ∈ E    (32)

xi + wi ≤ xj − wj + M(cij + dij + eij),   i < j ∈ N    (33)

xi − wi ≥ xj + wj − M(1 + cij − dij + eij),   i < j ∈ N    (34)

yi + mi·wi + ki ≤ yj − mj·wj − kj + M(1 − cij + dij + eij),   i < j ∈ N    (35)

yi − mi·wi − ki ≥ yj + mj·wj + kj − M(2 − cij − dij + eij),   i < j ∈ N    (36)

eij ≥ Lij / H and eij ≤ Lij,   (i, j) ∈ E    (37)

wmin,i ≤ wi ≤ wmax,i,   i ∈ N    (38)

xi, yi ≥ 0,   i ∈ N    (39)

li ∈ {1, 2, ..., H},   i ∈ N    (40)

cij, dij, eij ∈ {0, 1},   i < j ∈ N    (41)

zij ∈ N,   (i, j) ∈ E    (42)

Figure 23: Mixed integer linear programming model of the thermal-aware 3D microarchitectural floorplanning.

possible range of the half width of each module. (39) is a non-negative constraint for the

module location. li represents the level of module i, as shown in (40). (41) states that

(cij, dij, eij) are binary variables. cij, dij, eij denote the relative positions of the blocks, i.e.,

left, right, top, bottom, up, and down. Finally, (42) specifies that the number of flip-flops

must be an integer. Also note that M is a sufficiently large number.
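The delay constraint (27) and the big-M disjunction in (33)–(36) can be illustrated with two small helpers. This is a sanity-check sketch, not the solver: the function names are assumptions, and the half-height is passed directly rather than linearized as mi·wi + ki.

```python
import math

def min_flipflops(g_i, x_dist, y_dist, level_dist, alpha, beta, cycle):
    """Constraint (27): smallest integer z_ij satisfying
    z_ij >= (g_i + alpha*(X_ij + Y_ij) + beta*L_ij) / C."""
    delay = g_i + alpha * (x_dist + y_dist) + beta * level_dist
    return math.ceil(delay / cycle)

def placement_legal(mod_i, mod_j):
    """The condition constraints (33)-(36) enforce.  Each module is
    (x, y, level, half_w, half_h).  A different level (e_ij = 1)
    deactivates all four big-M constraints; on the same level, at
    least one of left/right/below/above must hold."""
    xi, yi, li, wi, hi = mod_i
    xj, yj, lj, wj, hj = mod_j
    if li != lj:                    # e_ij = 1: (x, y) overlap allowed
        return True
    return (xi + wi <= xj - wj or   # i left of j, constraint (33)
            xi - wi >= xj + wj or   # i right of j, constraint (34)
            yi + hi <= yj - hj or   # i below j, constraint (35)
            yi - hi >= yj + hj)     # i above j, constraint (36)
```

For example, with α = 0.08 ns/mm (the 80 ps/mm figure above), a 10 mm Manhattan distance at a 0.1 ns clock period requires several flip-flops on the wire.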

4.4.3 Linear Relaxation

The MILP floorplanning problem is NP-hard and requires prohibitive runtime to obtain

a legal solution. Specifically, zij, cij, dij, eij, and li are the only integer variables. To

Thermal-aware 3D Floorplanning
    partition blocks into layers;
    Initialize B(1) = {1}, M1(1) = N;
    for (u = 1 to log2 N)
        Call Thermal Analysis;
        for (count = 1 to run)
            Choose a block j and divide it into two;
            Specify Sjk(u), (x̄jk, ȳjk);
            Solve LP with cutline u;
            Update B(u + 1), Mj(u + 1), rj, vj, tj, bj;
        Project best solution for u + 1;
    Obtain cij, dij from prior slicing floorplan;
    Solve MILP;
    return xi, yi, wi, hi, zij;

Figure 24: Description of the thermal-aware 3D floorplanning algorithm. A top-down recursive bipartitioning is performed and LP-based floorplanning is solved at each iteration. The MILP is solved again after the last iteration using the slicing floorplanning result.

remedy this problem, the MILP is relaxed into a Linear Programming (LP) model as follows.³ A partitioning method similar to the one described in [68] is used to obtain li. To relax the

integrality while maintaining the feasibility and staying close to the optimal solution, first

the integrality of zij is relaxed to be a real number. The algorithm also solves several linear

programming problems to determine the relative positions among the modules, i.e., cij and

dij. If these cij, dij, and eij are used in Equations (33)–(36), and li and zij can take

real values, the MILP model shown in Figure 23 becomes a Linear Program.

The thermal-aware 3D floorplanning algorithm consists of multiple iterations, where

at each iteration a cutline is inserted to divide a region (alternatively called a block) into

two sub-blocks. The algorithm starts by creating a large block containing all modules

for each layer. At each iteration, the algorithm selects a block, divides it into two sub-blocks, and performs module floorplanning again so that the thermal/performance objective

is further minimized. At the beginning of each iteration, thermal analysis is called to obtain the temperature profile. Note that because of the bipartitioning method, the number of calls to the thermal analysis is reduced to only log N. During this process, the modules

³ If a near-optimal solution is required, MILP is a better approach than LP; however, MILP in general requires an excessive amount of computational power to solve.

in the chosen block should be enclosed by the block boundaries; the area-weighted mean (i.e., center of gravity) of all modules in each sub-block must then correspond to the center of the sub-block. In addition, the user-specified clock period constraint C needs to be satisfied, i.e., the longest combinational path delay should be less than C. The for loop terminates when each block contains exactly one module. Lastly, the relative positions among the modules are obtained from the slicing floorplan result, and the MILP (shown in Figure 23) is solved again. This time, however, the MILP formulation becomes an LP, since cij and dij are already determined and zij is still allowed to take non-integer values.

Figure 24 shows a description of the LP-based 3D slicing floorplanning algorithm. First,

partitioning is performed to obtain li. B(u) denotes the set of all blocks at iteration u,

and Mj(u) denotes the set of all modules currently in block j at iteration u. Sjk(u) is

the set of modules assigned to the center of sub-block k (k ∈ {1, 2}) contained in block

j at iteration u. Let the center of sub-block k contained in block j be denoted by (x̄jk, ȳjk).

Finally, let rj, vj, tj, bj denote the right, left, top, and bottom boundary of block j. Note

that each iteration can be repeated multiple times to obtain different slicing floorplans.

This is because multiple solutions exist that satisfy the boundary and center-of-gravity constraints during each bipartitioning. Thus, the algorithm performs

each bipartitioning several times and picks the best solution in terms of the total weighted

wirelength for the next iteration. After the final slicing floorplan is obtained, MILP is solved again using cij, dij obtained from the slicing floorplan to acquire a more compact solution.

The algorithm returns xi, yi, wi, hi, and zij as the final results of the thermal-aware 3D microarchitectural floorplanning.
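The overall flow of Figure 24 can be sketched as a short driver. `solve_lp`, `solve_milp`, and `thermal_analysis` are placeholder callables standing in for the LP/MILP solvers and the thermal analyzer; the `Slicing` container is likewise an assumption.

```python
import math

class Slicing:
    """One candidate slicing: the new block list plus its LP cost
    (total weighted wirelength).  A stand-in for the solver output."""
    def __init__(self, blocks, cost):
        self.blocks = blocks
        self.cost = cost

def thermal_aware_3d_floorplan(modules, solve_lp, solve_milp,
                               thermal_analysis, tries=3):
    """Top-down slicing driver (Figure 24 sketch).  Thermal analysis
    runs once per cut level, i.e. only O(log N) times; each level's
    bipartitioning is retried `tries` times and the lowest-cost
    slicing is projected to the next iteration."""
    blocks = [list(modules)]          # B(1) = {1}, all modules in one block
    levels = max(1, math.ceil(math.log2(max(len(modules), 2))))
    best = None
    for u in range(levels):
        temps = thermal_analysis(blocks)
        best = None
        for _ in range(tries):        # several feasible slicings exist
            cand = solve_lp(blocks, temps)
            if best is None or cand.cost < best.cost:
                best = cand
        blocks = best.blocks          # project the best solution
    # c_ij, d_ij now follow from the slicing, so the MILP is an LP.
    return solve_milp(best.blocks)
```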

Figure 25 shows the LP formulation for the thermal-aware 3D microarchitectural floorplanning, which is used at each iteration of the recursive bipartitioning-based (i.e., slicing)

floorplanning. At each iteration, a new cutline is inserted to divide a block into two sub-blocks while the total weighted wirelength is minimized. The block boundary constraints (43)–(46) require that all modules in the block be enclosed by the block boundaries. The center-of-gravity constraints (47)–(48) require that the area-weighted mean (i.e., center of gravity) of all modules in each sub-block correspond to the center of the

LP Formulation

Minimize  Σ_{(i,j)∈E} ( U1·λij·zij + U2·(1 − Tij)(Xij + Yij) )

Subject to (27)–(32), (38)–(39) and the following. If modules i and j are in the same level:

xi + wi ≤ rj,   i ∈ Mj(u), j ∈ B(u)    (43)

xi − wi ≥ vj,   i ∈ Mj(u), j ∈ B(u)    (44)

yi + mi·wi + ki ≤ tj,   i ∈ Mj(u), j ∈ B(u)    (45)

yi − mi·wi − ki ≥ bj,   i ∈ Mj(u), j ∈ B(u)    (46)

And for k ∈ {1, 2}, j ∈ B(u):

Σ_{i∈Sjk(u)} ai·xi = ( Σ_{i∈Sjk(u)} ai ) · x̄jk    (47)

Σ_{i∈Sjk(u)} ai·yi = ( Σ_{i∈Sjk(u)} ai ) · ȳjk    (48)

Figure 25: LP (Linear Programming) formulation of the thermal-aware 3D microarchitectural floorplanning. This LP is used to perform floorplanning at iteration u of the main algorithm shown in Figure 24.

sub-block. Another LP-based floorplanning model is also implemented, in which only the total wirelength and area are minimized, with the objective Σ_{(i,j)∈E} (Xij + Yij). An area- and performance-only variant is also used, with the objective Σ_{(i,j)∈E} λij·zij.

4.5 Experimental Results

The experiments are performed on ten SPEC2000 benchmarks, six from the integer suite

and four from the floating point, similar to Chapter 3. The training input set was used for

profile collection while the IPC performance results were gathered using the reference input

set. Each simulation for the reference run was fast-forwarded by 200 million instructions

and then simulated for 100 million instructions. Each simulation for training was fast-

forwarded by 100 million instructions and then simulated for 100 million instructions. The

main objective of moving to 3D ICs is to minimize wirelength such that the performance

can be improved. Hence, the largest configuration is selected from Chapter 3 for the study.

In addition, from Chapter 3, it is shown that large/complex processors are more likely to

benefit from profile-driven floorplanning, as used in this study.

Figure 26: Performance versus Frequency Scaling for different numbers of layers

First, the results show that, without the thermal constraint, scaling up the clock frequency increases the inter-module communication latency in cycles, reducing IPC despite the use of a good floorplanning algorithm. By moving to 3D ICs, in particular by increasing the number of layers, the communication paths are shortened and their impact on IPC is reduced, as shown in Figure 26.

Next, the results report the IPC improvement of four layer 3D ICs for profile driven

(prof), wirelength driven (wl), thermal driven (therm), hybrid, and perfect floorplanning as

shown in Figure 27. In perfect floorplanning, wire delay is assumed to be zero, so all modules communicate with latencies that depend only on gate delay; this gives an upper bound on what floorplanning can achieve. The figure shows that profile-driven floorplanning does a good job, coming very close to perfect floorplanning. The hybrid approach does a little

better than thermal and wirelength driven floorplanning.

In terms of thermal reduction, thermal-driven floorplanning achieves a 24% reduction compared with the profile-driven approach, as shown in Figure 28. Here, the maximum

temperature is reported relative to ambient (zero degrees is ambient temperature). Thermal-driven floorplanning gives the best result among all four floorplanning objectives. Wirelength-driven floorplanning also performs well in terms of maximum temperature, because it tries to balance distance among all modules

without favoring any single module. Profile-driven floorplanning, as expected, yields a poor thermal result, because performance and temperature are conflicting objectives; the user can therefore make balanced tradeoffs with the parameters U1, U2, and U3. The hybrid approach also has a higher temperature than thermal- and wirelength-driven floorplanning, but good performance.

Figure 27: IPC Performance

Figure 28: Maximum Temperature

In terms of wirelength, the profile-driven approach increased wirelength by 49% on average, the thermal-driven approach by 35%, and the hybrid approach by 25%, all relative to the wirelength-driven case.

Next, the impact of the number of layers on IPC is studied, as shown in Figure 29. As the number of layers increases, profile-driven floorplanning approaches the performance of perfect floorplanning faster than the other techniques, and the hybrid approach performs slightly better than the thermal- and wirelength-driven approaches. With more layers, all techniques improve in performance.

Figure 29: Impact of number of layers on IPC, normalized values averaged across benchmarks

Figure 30: Impact of number of layers on maximum temperature, normalized values averaged across benchmarks

Finally, the impact on temperature is studied when the number of layers is increased

as shown in Figure 30. By increasing the number of layers, the maximum temperature

increases at a high rate. High temperatures can result in circuit malfunction. Profile-driven floorplanning shows the steepest temperature increase and must be used with caution.

The graph also demonstrates that thermal driven floorplanning results in the slowest rate

of temperature increase.

Each benchmark requires approximately two hours of CPU time on a dual-processor 2.4 GHz Pentium Xeon system; the majority of that time is devoted to simulation.

4.6 Summary

This chapter studied the impact on next-generation microprocessor design of combining several proposed techniques to reduce the effect of wire delay. It is shown that moving to multi-layer 3D ICs together with profile-driven floorplanning can increase the performance of next-generation microprocessors. However, in 3D ICs heat dissipation becomes an issue and can result in circuit breakdown if designers are not aware of it. A thermal-driven floorplanning approach is therefore proposed that achieves a 24% reduction in maximum temperature compared to the profile-driven approach. In addition, a hybrid approach that considers both thermal and performance issues is proposed. We believe there is still room for improvement in this hybrid approach, which warrants further research.

CHAPTER V

MICROARCHITECTURAL DESIGN SPACE

EXPLORATION

The unprecedented pace of feature-size miniaturization enables a huge number of transistors to be integrated onto a single chip and delivers ever-increasing clock frequencies.

According to the projection of the International Technology Roadmap for Semiconductors

(ITRS) [103], deep submicron process technology will be able to integrate more than 3.5 billion transistors on a single processor die with a 20 GHz clock frequency in 2012. The success of process technology scaling, which keeps shrinking the transistor gate length, leads to ever-faster CMOS circuit switching. If each pipeline stage of a processor were constrained only by the minimum switching time of a transistor, then higher performance could continue to be attained by reducing gate delay. Nevertheless, for processors designed at deep submicron feature sizes below 100 nm, technology downscaling alone will no longer be sufficient for attaining higher performance. The faster clock rate, a result of reduced switching time, gradually impairs the communication efficiency between components inside a processor: the region reachable within one clock cycle is shrinking. In other words, wires and interconnect are becoming the major factor limiting the performance scalability of a chip. Even though the length of an individual wire can be scaled down proportionally, the overall wire problem worsens because increased processor/system complexity brings longer global communication links.

In consequence, the performance of a computing system is increasingly limited by communication rather than by computation; a future computer designed in deep submicron technology will spend more time communicating data operands or exchanging control information than performing effective computation unless designers take effective measures to address this disparity. This leads to a scalability problem, in which the performance gain diminishes as the process technology is scaled down.

A prominent example is Intel's Pentium 4 processor [54], which dedicated two pipeline

stages (drive stages) for communication because of unmanageable wire delay. Another

example is the Alpha 21264 processor, which partitioned the integer execution unit (EBOX) into two clusters [48]; one additional cycle is required to communicate between the two clusters when needed. Given the reciprocal relation between cycle time and latency in cycles, at high clock frequencies decreasing the cycle time will not directly translate into an overall performance improvement if the circuit design paradigm stays the same.

With the abundance of available silicon area and the growing impact of wire delay, next-generation processors are likely to become application-specific high-performance machines. For example, for applications that demand substantial floating-point capability, the floating-point units should be placed closer to the execution units and ALUs; cache-bound applications, on the other hand, should place the datapath close to the caches. To address the potential design complexity of next-generation processors, this chapter proposes a new methodology called the Adaptive Microarchitectural Planning Engine (AMPLE), a search engine that identifies the most complexity-effective processor configuration, in terms of performance and performance/area, in a highly efficient manner for a given suite of target applications.

The AMPLE framework is an extension of the previously proposed profile-guided microarchitecture floorplanning technique of Chapter 3, which takes wire delay into account when designing future deep submicron processors.

The rest of this chapter is organized as follows. Section 5.1 discusses the challenges in high-performance processor design and prior work related to design space exploration. In

Section 5.2, the adaptive search engine is introduced and its effectiveness in exploring the design space is described; its efficiency and results are compared with those of a simulated annealing approach and a brute-force search. Next, the architectural framework for deep submicron processors drawn from the design space exploration results is

explained and analyzed in Section 5.3. Finally, Section 5.4 concludes this work.


Figure 31: Microarchitecture Planning

5.1 Impact of process technology and challenges

The progression of process technology enables denser integration of transistors that switch at higher frequencies. While process scaling shrinks the transistors, the interconnecting wires do not scale nearly as well. Further, as the number of transistors increases with each generation, global interconnect lengths grow, causing increased performance degradation; wire delay will come to dominate the overall delay equation.

increasing in each generation, global interconnect lengths are increasing, resulting in in-

creased performance degradation. So, wire delay will dominate the overall delay equation.

The effect of increasing wire delays is that the cost of data transportation continues to in-

crease, relative to the cost of data computation. Microarchitecture floorplanning proposed

in Chapter 3, which is physical design that consider the impact on microarchitecture dur-

ing floorplanning, becomes important. With different floorplan, it can result in different

performance even when pipeline concept is introduced. Figure 31 illustrates how microar-

chitectural floorplanning works. Given different module size, then first the floorplanning

is performed. After floorplanning is complete, latency information between each module is

extracted. Then simulator that supports non-uniform access latency is invoked. The result

from the simulator is used to redesign the system. As shown in Figure 32, the overall performance in normalized BIPS for different clock frequencies is reported, in which the clock

period is decreased from 0.2ns (5GHz), 0.18ns (5.5GHz) down to 50ps per cycle (20GHz).

The results, averaged across all benchmarks, are normalized to the BIPS delivered by 5GHz

baseline machine using the floorplanner that minimizes the wirelength. It is implied that

future high performance processor designs should include microarchitectural floorplanning

to reduce the communication latency by placing high bandwidth communication modules

adjacent to each other. However the performance improvement in terms of BIPS does not

scale at the same rate as the improvement in frequency. Microarchitecture floorplanning alone is not good enough; identifying the module sizes that yield the best return should also be considered.

Figure 32: Performance versus Frequency Scaling (wirelength-driven versus microarchitectural planning)

5.2 Adaptive Search

For future deep submicron processor designs, because of communication latency issues, increasing the capacity of monolithic microarchitecture modules (e.g., caches and the branch target buffer) will not necessarily guarantee higher performance. To maintain the complexity-effectiveness of a processor design, it is imperative to identify the right

module sizes (or parameters) for a particular design objective. The entire process involves a

search in the design space for finding the theoretically optimal parameters for each microar-

chitecture component with the objective of achieving the best possible performance using

the least die area. Exploring all possible permutations of the microarchitecture design parameters, however, is extremely tedious and time-consuming, if not impossible, requiring an enormous number of computing cycles and much patience. In fact, this brute-force approach for

exploring a processor design space, even though widely adopted, is practically undesirable

when the product is under tight time-to-market pressure.

To reduce the effort of a brute-force design space exploration while managing to find

the most effective design options, an efficient mechanism called Adaptive Microarchitecture

Planning Engine (AMPLE), an extension of the profile-guided floorplanning framework

Figure 33: Microarchitecture Planning Design Space Exploration Framework

proposed in Chapter 3, with the additional AMPLE search engine component illustrated in Figure 33, is proposed. In this framework, the system first reads the technology parameters, an initial machine description, and the target application benchmark programs as the inputs that represent the design goal of the processor. For cache-like or buffer-like structures, the area and module delay are estimated using analytical tools such as CACTI [41] from the HP Western Research Labs. For the scaling of other non-array-based structures such as ALUs, one can use tools such as GENESYS [41], which uses a .35 µm Verilog model as the baseline design. Meanwhile, the statistical interconnection traffic is profiled and collected from a cycle-accurate architecture simulator, in this framework the SimpleScalar simulator [8]. All the module information, statistical interconnection traffic, and the target frequency range are then fed into the profile-guided floorplanner with an

objective of minimizing the latency (i.e., communication distance) of the most frequent

communication links between modules. Afterward, the inter-modular latencies are derived

based on the newly generated floorplan. The cycle-accurate simulator is again used to evaluate the architecture performance using the new latency cycles. Given the cycle time, the

total execution time can be computed. Along with the machine description parameters, it

is then given as the input to the AMPLE search engine. The AMPLE search engine, based

on a gradient search heuristic, is detailed in the subsequent section. The AMPLE search

engine calculates and determines a new set of architecture parameters and iterates through

the entire process until a satisfactory parameter suite is found.
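The iteration just described can be sketched as a simple driver loop; all four callables are placeholders for the boxes in Figure 33, not the framework's actual interfaces.

```python
def ample_loop(initial_config, floorplan_and_simulate, search_step,
               satisfied, max_iters=50):
    """Outer AMPLE iteration (sketch): floorplan and simulate the
    current configuration, then let the search engine propose the
    next one, until a satisfactory parameter suite is found."""
    cfg = dict(initial_config)
    for _ in range(max_iters):
        exec_time = floorplan_and_simulate(cfg)   # floorplan + simulation
        if satisfied(cfg, exec_time):
            break
        cfg = search_step(cfg, exec_time)         # gradient-search step
    return cfg
```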

5.2.1 Gradient Search

The proposed gradient search algorithm is presented in Figure 34. First the algorithm is

discussed in general before the details of the internal algorithms are presented.

At the very beginning, the gradient search is initialized with the Smart Start() algorithm, which, based on the characteristics of each application, assigns the initial microarchitecture parameter values for the starting search point. A good starting search point

could reduce the overall design space exploration time substantially as the gradient search

method tends to move toward a local optimal point in the search space. If the starting

search point is given at a location somewhere close to the optimal point, then a smaller

number of steps is needed. The details of Smart Start() will be discussed in Section 5.2.2.

Given these initial conditions, the microarchitectural floorplanning is performed, which

consists of the technology parameter extraction, microarchitectural profiling, floorplanning,

and cycle-accurate simulation, as illustrated in Figure 33. Before entering the iterative

gradient search for all N microarchitectural components, the relative importance of each

microarchitectural component p for the target applications is prioritized using the Priority Search() algorithm, which is elaborated in Section 5.2.3. In essence, the most sensitive microarchitectural component is given the highest priority to start with in the main

search loop.

Using the initial conditions, the gradient search will enter the iterative design space

exploration following the given priority of p. For each iteration i, the last known best

microarchitecture parameter is used as the starting point. The value of the microarchitectural parameter para[p, i] of the current iteration i is tuned according to two non-zero default values, the initial stepping factor (α) and the stepping size (µ) [45], and the microarchitectural floorplanning, Microarch planning(), is repeated to approach the design objective, e.g.,

performance. Note that prior to the microarchitectural floorplanning, a parameter pruning

algorithm Para pruning() is always called to rule out invalid parameters. More discussion of

Para pruning() is given in Section 5.2.4. The initial stepping factor α value is used only in the first iteration to provide the initial gradient. Once inside the while loop, i.e., the main

gradient search loop, a stepping size µ is then used to increase or decrease the microarchitec-

ture parameter depending on whether a constructive gain or a destructive gain is obtained

from the last iteration. The Calc gain() function shown in Figure 34 calculates the gain using Equation (49). Essentially, the gain represents the overall performance improvement

(or reduction) in percentage. If the gain is negative, i.e., the performance was reduced, then the search is heading in the wrong direction and needs

to be reversed. Otherwise, it should continue by increasing the para[p, i] for the next iter-

ation (i+1) based on Equation (50), in which the amount of increase is the product of the

stepping size µ, the gain and the parameter deviation from the previous two iterations [45].

gain[p, i] = exec time[p, i − 1] / exec time[p, i] − 1    (49)

para[p, i] = para[p, i − 1] + µ × gain[p, i − 1] × (para[p, i − 1] − para[p, i − 2])    (50)
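Equations (49) and (50) translate directly into code; a sketch (the function names are assumptions):

```python
def calc_gain(exec_time_prev, exec_time_cur):
    """Equation (49): fractional performance improvement from the last
    parameter move; negative when execution time got worse."""
    return exec_time_prev / exec_time_cur - 1.0

def next_param(para_prev, para_prev2, gain_prev, mu):
    """Equation (50): step in proportion to the last gain and the last
    parameter deviation; a negative gain reverses the direction."""
    return para_prev + mu * gain_prev * (para_prev - para_prev2)
```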

Note that the stepping size µ will determine the search quality and/or the search time. A

finer stepping size will lead to better quality that approaches the optimal point more closely,

yet a longer search time. On the other hand, if the stepping size is too big, the search can diverge or overshoot the optimum. Whenever a new para[p, i] is generated,

the Ceiling or the Floor function of the corresponding microarchitecture component p will

be called to clamp the parameters to meet the particular criteria of a microarchitecture

component. For example, the number of cache sets must be a power of two for indexability, and the number of ALUs or FPUs has to be an integer.
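A possible Ceiling/Floor helper for power-of-two parameters such as cache set counts (a sketch; the thesis does not specify the exact rounding rule):

```python
def clamp_pow2(value, lo, hi):
    """Round a proposed parameter to the nearest power of two inside
    [lo, hi]; cache set counts must be powers of two for indexability.
    Assumes lo >= 1 and lo, hi are themselves powers of two."""
    value = max(lo, min(hi, value))
    lower = 1 << (int(value).bit_length() - 1)   # largest power of two <= value
    upper = lower * 2
    best = lower if value - lower <= upper - value else upper
    return max(lo, min(hi, best))
```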

The whole iterative process for p continues until one of the following two conditions is

met: (1) the overall performance gain gain[p, i] drops below a given threshold value T or

(2) the exploration process is going cyclic because of hitting a repetitive microarchitecture

parameter value of p that was explored earlier.

In addition, in the first iteration a ratio called the Gradient First-order Ratio (GFR) is calculated using Calc GFR(); it serves as an index of how relevant a microarchitecture component p is to the target applications. More information on the GFR metric is given in Section 5.2.5.

5.2.2 Smart Start

In [58], applications are generally classified into three classes: processor-bound, cache-sensitive, and bandwidth-bound. Among the SPEC2000 benchmark programs, mesa, mgrid, and equake are considered to be processor-bound; gcc, ammp, vpr, parser, and perl are categorized as cache-sensitive; while art, mcf, and sphinx are bandwidth-bound. In this chapter, the same classification scheme in the

Smart Start() algorithm is applied to prepare the initial conditions for each benchmark pro-

gram. The initial microarchitecture parameter values are predetermined based on the class

it belongs to. For example, for the processor-bound applications, a wider machine width, a larger number of ALUs, a bigger instruction cache, and a bigger BTB are assigned. By similar logic, the cache-sensitive applications are allocated larger L1 and L2 data caches. Finally, for the bandwidth-bound class, it is hard to adjust the module parameters, since on-chip microarchitecture modules do not really improve off-chip memory bandwidth. For this class, the L3 size and the load/store queue are enlarged as the initial conditions.
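As a sketch, Smart Start() amounts to a lookup from application class to initial parameter values; the class map follows the text and the values follow Table 6, while the dictionary key names are hypothetical.

```python
# Hypothetical sketch of Smart Start(): pick initial microarchitecture
# parameters from the application class (values follow Table 6).

SMART_START = {
    "processor_bound": {"width": 8, "btb": 512, "ruu": 256, "lsq": 128,
                        "l1_i": 16 * 1024, "l1_d": 8 * 1024,
                        "l2": 128 * 1024, "alu": 6, "fpu": 4},
    "cache_sensitive": {"width": 4, "btb": 256, "ruu": 128, "lsq": 256,
                        "l1_i": 8 * 1024, "l1_d": 16 * 1024,
                        "l2": 512 * 1024, "alu": 4, "fpu": 2},
    "bandwidth_bound": {"width": 4, "btb": 256, "ruu": 128, "lsq": 128,
                        "l1_i": 8 * 1024, "l1_d": 8 * 1024,
                        "l2": 128 * 1024, "alu": 4, "fpu": 2},
}

def smart_start(benchmark):
    # SPEC2000 classification as given in the text (after [58]).
    classes = {
        "mesa": "processor_bound", "mgrid": "processor_bound",
        "equake": "processor_bound",
        "gcc": "cache_sensitive", "ammp": "cache_sensitive",
        "vpr": "cache_sensitive", "parser": "cache_sensitive",
        "perl": "cache_sensitive",
        "art": "bandwidth_bound", "mcf": "bandwidth_bound",
        "sphinx": "bandwidth_bound",
    }
    return dict(SMART_START[classes[benchmark]])
```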

5.2.3 Priority Search

The search priority for design space exploration is based on the classification discussed in Section 5.2.2, since the higher-impact parameters result in more performance improvement or degradation. If the search time is limited, the higher-priority parameters should be searched first. For the processor-bound class,

machine width, instruction cache size, number of BTB entries, and number of ALUs are

given higher priority. For the cache-sensitive applications, L1 data cache, L2 cache, and

load/store queue have higher priority. For the bandwidth-bound class, all three cache levels and the load/store queue receive more focus. Note that if the search time is constrained, the microarchitecture parameters with lower priority are not explored, as they do not pose any major impact on the overall performance improvement.

5.2.4 Search Pruning

During the design space exploration, many combinations of architecture parameters can be omitted. For example, to maintain the cache inclusion property [9], the size of the L1 cache should be less than that of the L2. The following bullets summarize three guidelines used to prune the search space. Note that this search pruning approach is also applied to the brute-force search that will be discussed later as a reference baseline in Section 5.3.

• cache size L1 < L2 < L3

• machine width ≥ number of alu

• No search in floating-point units for integer applications.

Another criterion in AMPLE is to enforce lower and upper bound constraints. First of all, for a given processor design, only a limited number of resources is available. An upper bound constraint guarantees that a particular microarchitecture module will not grow into overkill. For example, designing a 20 MB or larger on-die L3 cache might not be very convincing for a 90 nm processor. On the other hand, a lower bound constraint is imposed so that an essential microarchitecture module will not be completely removed. For example, at least one ALU is needed.
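A minimal sketch of a pruning predicate combining the three guidelines with the bound constraints might look like this; the configuration keys and the bound values are illustrative assumptions, not figures from the dissertation.

```python
# Hedged sketch of Para pruning(): reject parameter combinations that
# violate the pruning guidelines or the lower/upper bound constraints.
# BOUNDS values are examples only, not the dissertation's settings.

BOUNDS = {"alu": (1, 16), "l3": (0, 8 * 2**20)}

def is_valid(cfg):
    # Cache size ordering: L1 < L2 < L3 (when an L3 exists).
    if not cfg["l1_d"] < cfg["l2"]:
        return False
    if cfg.get("l3") and not cfg["l2"] < cfg["l3"]:
        return False
    # Machine width must cover the ALUs it feeds.
    if cfg["width"] < cfg["alu"]:
        return False
    # Skip FPU exploration for integer-only applications.
    if cfg.get("int_only") and cfg.get("fpu", 0) > 0:
        return False
    # Lower/upper bound constraints on each parameter.
    for name, (lo, hi) in BOUNDS.items():
        if not lo <= cfg.get(name, lo) <= hi:
            return False
    return True
```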

5.2.5 Gradient First-order Ratio

With the limit in search time, spending more time on higher-impact parameters is obviously more rewarding. The best way to identify the critical parameters is to explore the entire search space and compute a correlation metric. Nevertheless, with a large search space, this method is practically infeasible. A more efficient alternative is to use a sample set drawn from the search space and calculate the correlation based on that sample. This method is more practical, but still requires a non-trivial search time. In particular, acquiring a sample set that is representative of the entire search space is not trivial; running more search points brings us closer to such a sample set.

Once the correlation metric is identified, it can be used to perform an extensive search on the

critical microarchitecture parameters in lieu of searching for the entire search space. This

might not result in the true optimal solution, but can reduce the search time significantly.

GFR(p) = gain[p, 2] / max{gain[q, 2] | q ∈ N}    (51)

Here, a Gradient First-order Ratio (GFR) metric is proposed, as shown in Equation (51). Even though it is not as good as the correlation function, it can be used to roughly estimate how crucial each parameter is. The GFR metric requires running only 2N searches for N parameters. The GFR of a parameter p is the ratio of the gain in its first iteration to the maximum gain among all parameters.
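Equation (51) can be computed in a few lines; the sketch below assumes the first-iteration gains are positive, so the most important parameter normalizes to one, as in Table 8.

```python
# Sketch of Calc GFR() (Equation (51)): normalize each parameter's
# first-iteration gain by the largest gain observed across parameters.

def calc_gfr(gains):
    """gains maps parameter name -> gain[p, 2] (assumed positive)."""
    top = max(gains.values())
    return {p: g / top for p, g in gains.items()}
```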

5.3 Experimental Results

5.3.1 Experimental Results for Design Space Exploration

In this section, three different design space exploration methodologies are evaluated, including brute-force, simulated annealing, and the adaptive method. The baseline processor model is illustrated in Figure 35. The L1 cache is split into three semantic-aware caches, to be discussed in Section 5.3.2. The experiments assume a 10 GHz processor designed with a 50 nm feature size as projected by the ITRS [103]. The technology parameters based on 50 nm are used in the technology scaling models. The brute-force search is performed using the combinations of the parameters shown in Table 4. In each search, each parameter takes one of the two values shown in the table. In other words, 2^N simulations are needed in the brute-force method given N different microarchitecture parameters. In addition, search pruning is applied to the brute force in order to reduce the number of exploration simulations.

Micro-arch    Para Value 1   Para Value 2
M. width      4              8
BTB           128            512
RUU           128            512
LSQ           64             128
L1 Icache     8KB            32KB
L1 Dcache     8KB            32KB
L2 Ucache     128KB          512KB
L3 Ucache     None           1MB
ALU           4              8
FPU           2              4

Table 4: Brute-force search space

As shown in Table 4, ten microarchitecture parameters are investigated. M. width represents the machine issue width: in the ideal case, when no dependency is present between instructions, the larger the machine width, the more instructions can be issued in a single cycle. BTB is the branch target buffer that stores target branch addresses for predicted taken branches. RUU, the Register Update Unit [106], combines the functionality of the reservation station and the re-order buffer for supporting out-of-order execution. The number of entries in the RUU also corresponds to the number of physical registers available in a processor. LSQ is the load/store queue, an agent that manages memory references and disambiguates potential address conflicts between stores and loads. A three-level cache hierarchy is used in the processor model, including a split L1 I/D cache, a unified L2, and a unified L3 cache.¹ ALU and FPU are the arithmetic logic unit and the floating-point unit, respectively. The values selected for the brute-force exploration are parameters commonly seen in state-of-the-art high-performance microprocessors. Note that for the brute-force search, if the number of values per microarchitecture parameter (P) is increased, the total search space expands exponentially as P^N. Performing the brute-force search over the entire space becomes infeasible if the available computing cycles are

limited. Only this small subset of the brute-force space is shown because of the limits in computation power and available time.

The simulated annealing approach is based on the algorithm proposed by Cong et al. in [35]. In that work, a simulated-annealing floorplanner performs parameter tuning to minimize the weighted wirelength for the delay objective, or to select alternative block configurations. Alternative block selection was introduced to resize

1We assume a dedicated snooping port for the L3 cache, thus the L3 is dual-ported.

Bench       Brute force           Simulated Annealing    Gradient
            Time(s)    Iteration  Time(s)    Iteration   Time(s)   Iteration
164.gzip    1,129,859  384        121,013    43          104,922   36
175.vpr     1,461,870  384        181,596    43          149,959   35
181.mcf     752,447    384        103,033    43          64,136    36
254.gap     1,171,868  384        175,775    43          128,714   35
300.twolf   729,445    384        75,891     43          65,796    36
171.swim    2,989,659  768        222,650    43          163,162   33
179.art     5,619,655  768        452,797    43          311,324   38
189.lucas   1,426,632  768        77,892     43          55,630    32
Avg.        14.31                 1.34                   1.00

Table 5: Running time comparison

Class            M. width  BTB  RUU  LSQ  L1 Icache  L1 Dcache  L2 Ucache  ALU  FPU
Processor bound  8         512  256  128  16K        8K         128K       6    4
Cache sensitive  4         256  128  256  8K         16K        512K       4    2
Bandwidth bound  4         256  128  128  8K         8K         128K       4    2

Table 6: Initial conditions used by Smart Start

the module in order to improve performance while not increasing the access delay of the module. As shown in [43], a profile-guided floorplanner can outperform a timing driven

floorplanner for a high-frequency processor design. For a fair comparison, only alternative block selection at low temperature, similar to [35], is compared with the adaptive method. For the underlying floorplanning, the profile-guided floorplanner is selected.

For the AMPLE engine, the following constants are used: initial stepping factor α =

0.10, gain threshold value T = 100, and the stepping size µ = 1. The microarchitecture

parameters picked by Smart Start() for the three application classes discussed are listed in Table 6, with no L3 Ucache.

Before the quality of the design space exploration results is investigated, the time complexity, i.e., the exploration cycles, is compared for the three schemes in Table 5. As expected, the brute-force search requires an order of magnitude more simulation time than the other two algorithms. Furthermore, the simulated annealing approach spends about 33% more exploration time than the AMPLE search engine.

Figure 36 illustrates the design space exploration comparison of four different methods: best, sa, gra, and gra II. All the results were measured in absolute performance (i.e., IPC × clock period) and were normalized to the average of the brute-force

search method. The best represents the best processor design parameters reported from the brute-force search; the sa results use the simulated annealing approach; the gra results are based on the AMPLE search engine using performance as the objective function, while gra II also applies the AMPLE technique but uses both performance and die area as design objectives. Note that the framework can easily adopt other design objectives, such as minimizing exec time × area or exec time² × area, depending on the objective weighting factors. By modifying the target objective, the AMPLE engine can search among Pareto points and provide them as alternative solutions for architects to choose from. The average of the brute-force search is used as the baseline because it is likely to represent a randomly selected configuration. The figure shows that all four methods outperform

the baseline, average case, by more than two times in performance. gra, on average, outperforms best by 12.7% (up to 2.12 times) and sa by 14.1% (up to 1.45 times). Finding the optimal solution is a hard problem here, since the latency among modules is based on location information from the floorplanner, and minimizing weighted wirelength is itself an NP-hard problem. Despite this, the author believes that the optimal and nearly optimal parameter sets are clustered together. Gradient search reaches such a cluster faster than brute force or a random search such as simulated annealing, especially if the initial solution is good.

For the area comparison, best and sa require about the same space, which is about three times smaller than the baseline. gra requires six times less area than the baseline, and gra II requires an order of magnitude less area than the baseline. As pointed out earlier, there is a cluster where the nearly optimal module parameters lie. Based on the results, it turns out that such configurations do not require as much die area as current microarchitecture designs do.

For the exec time × area tradeoff, as shown in Figure 38, gra is more than 80% better in terms of the execution-time-area product than best and sa. Moreover, gra II is more than three times better than best and sa.

Module    gzip  vpr   mcf   gap   twolf  swim  art   lucas
Btb       0.05  0.04  0.01  0.02  0.02   0.00  0.01  0.01
Ruu       0.02  0.16  0.02  0.04  0.01   0.54  0.79  0.52
Lsq       0.04  0.11  0.00  0.08  0.01   0.26  0.51  0.19
i1        0.01  0.25  0.60  0.57  0.65   0.00  0.00  0.01
d1        0.36  0.20  0.08  0.14  0.07   0.46  0.02  0.43
l2        0.21  0.27  0.47  0.38  0.41   0.02  0.04  0.00
l3        0.20  0.36  0.02  0.03  0.01   0.00  0.12  0.00
memport   0.81  0.71  0.40  0.50  0.34   0.47  0.13  0.46
Alu       0.45  0.37  0.20  0.24  0.18   0.25  0.06  0.26
fpu       na    na    na    na    na     0.01  0.00  0.01

Table 7: Correlation of Microarchitecture Parameters

Module    gzip  vpr   mcf   gap   twolf  swim  art   lucas
Btb       0.71  0.05  0.27  0.08  0.11   0.13  0.06  0.65
Ruu       0.19  0.07  0.09  0.02  0.03   0.68  1.00  0.67
Lsq       0.12  0.03  0.02  0.08  0.01   0.02  0.20  0.04
i1        0.17  0.02  0.45  0.32  0.29   0.07  0.04  0.03
d1        0.29  0.71  0.84  0.75  0.55   0.04  0.15  0.06
l2        0.12  0.01  0.08  0.05  0.18   0.35  0.02  0.13
l3        0.19  0.12  0.20  0.04  0.04   0.01  0.01  0.05
memport   1.00  1.00  1.00  1.00  1.00   1.00  0.14  0.25
Alu       0.11  0.24  0.33  0.23  0.03   0.35  0.00  1.00
fpu       na    na    na    na    na     0.02  0.03  0.10

Table 8: Gradient First-order Ratio (GFR)

Overall, these experimental results imply that the next-generation high-performance processor does not have to be large. Moreover, considering the performance/die-size trade-off, a small processor is more persuasive.

Next, the GFR metric is compared with the correlation metric generated from the brute-force search. Table 7 illustrates the correlation metric; the closer a value is to one, the more important the parameter. Table 8 shows the GFR metric generated using the proposed method, in which the most important parameter has the value one. Note that the GFR requires only 2 × N search runs for N parameters, whereas the correlation metric requires exploring a large part of the search space to obtain values that best represent the entire space.

5.3.2 Experimental Results for Microarchitecture Enhancement

The overall latency can be improved by reducing both the access delay to a module and the communication delay. Memory partitioning techniques [47, 74, 67, 110] were largely proposed to improve energy consumption, but they also bear the potential for access delay reduction because of their smaller array structures. In this section, semantic-aware multilateral memory partitioning (SAM) [74] is evaluated using our technique. Originally, SAM was proposed to reduce the dynamic energy consumption in the data TLB and caches. Here, the technique is employed to see whether a performance improvement can be gained when the monolithic data cache is partitioned into three smaller semantic-aware cachelets, which shrink the access delay and are more flexible for placement in the floorplan.

The partitioning is shown inside the dashed box of Figure 35. Semantic awareness means that the single memory stream is split according to the semantic regions — stack, global, and heap — defined by programming languages and software convention. Their unique access behaviors and locality characteristics are exploited by this new partitioning. In addition, this technique alleviates the need for multi-porting the ever larger caches in wide-issue machines.

Figure 39 illustrates the performance improvement using a semantic-aware L1 data cache. Even though the semantic-aware cache was aimed at energy reduction rather than performance improvement — in fact, as shown in [74], it can reduce performance because of the smaller structures used for stack and global data — the framework achieves an average performance improvement of 2.7%, since it provides more headroom in the design space exploration for finding the best possible floorplan and determining the best configurations for a large number of modules.

Apart from this particular technique, modeling architectures like NUCA [66] or NuRAPID [28], where caches are partitioned into smaller caches based on distance associativity, can reduce the wire delay problem. Finally, the clock distribution network in future processor designs may need to be modified. At present, almost all processor designs use synchronous state machines. Such designs require that a globally synchronous clock signal be distributed throughout the chip with minimum skew. One traditional method to distribute the clock is to use balanced recursive H-trees. However, the increasing wire delay and the variation in signal paths through the H-trees may no longer faithfully deliver the synchronicity needed. Also, global distribution of the clock signal at higher frequencies leaves very small clock skew tolerance. Another known problem in global clock distribution

67 is increased power consumption. Future high performance processor designs may need to

consider a GALS (Globally Asynchronous and Locally Synchronous) design that segregates

the clock distribution and gives more control over local modules to alleviate these problems.

The evaluation of these architectures in the framework may generate more interesting re-

sults.

5.4 Summary

In this chapter, a framework that performs an efficient and effective design space exploration

for deep submicron microprocessor design is proposed. As the feature size decreases and

the microarchitecture becomes more complex with more modules, increasing the module

size does not necessarily guarantee a better overall performance. Meanwhile, the design

space exploration to find the best microarchitecture becomes very difficult and less manageable because the number of possible combinations increases exponentially.

Searching among all possible points with a typical brute-force method requires enormous computing cycles and is completely impractical, especially when time-to-market

is of concern. Using the framework with the Adaptive Microarchitecture Planning Engine

(AMPLE), a cost-effective set of microarchitecture parameters can be identified within a

reasonable amount of time.

From the experimental results, the AMPLE framework outperforms the brute-force and simulated annealing methods in terms of exploration time, with speed-ups of 14.31x and 1.34x over the two methods, respectively. Furthermore, the

technique also generates microarchitecture parameters that either deliver the best overall performance or the smallest die area with some performance guarantee, depending on the design objective. When performance is the design objective, the

microarchitecture suggested by the framework provides up to 2.12x and 1.45x performance

speed-up over the brute-force and simulated annealing results. When the product of the execution time and area is the objective, the framework provides a microarchitecture that is more than three times better in terms of that product. Even though only performance and area

tradeoffs are explored in this dissertation, the AMPLE engine can be easily extended to

68 facilitate other design objectives.

Gradient Search Algorithm

Smart Start();
Microarch planning();
Priority Search();
for para p = 1 to N
    Step 1: para[p, 1] = the last known best para;
    Step 2: para[p, 2] = (1 + α) × para[p, 1];
            Ceiling_p(para[p, 2]);
            Para pruning(para[p, 2]);
            Microarch planning();
            Calc GFR();
            Calc gain();
            i = 3;
    Step i: while (gain[p, i − 1] > T && not cyclic)
                para[p, i] = para[p, i − 1] + µ × gain[p, i − 1] × (para[p, i − 1] − para[p, i − 2]);
                if (gain[p, i − 1] > 0) Ceiling_p(para[p, i]);
                else Floor_p(para[p, i]);
                Para pruning(para[p, i]);
                Microarch planning();
                Calc gain();
                i++;
            end while
end for

Figure 34: Gradient Search Algorithm

[Figure 35: Detailed Processor Microarchitecture — the L1 data cache is split into semantic-aware sCache, hCache, and gCache]

[Figure 36: Performance (speedup) comparison using average brute force as the baseline]

[Figure 37: Area comparison using average brute force as the baseline, for best, sa, gra, and gra II]

Figure 38: Execution Time × Area comparison using average brute force as baseline


Figure 39: Performance improvement by applying Semantic Aware Cache

CHAPTER VI

MICROARCHITECTURE-AWARE CIRCUIT

PLACEMENT WITH RETIMING

Since microprocessors are pipelined, flip-flops can be inserted along the paths of a microprocessor as long as the control unit is designed to be aware of the extra latency on that path. Inserting flip-flops along a wire can make the system meet the timing constraints without violating the correctness of the circuit on that path, because the function of a wire is simply to transfer a signal from one location to another. However, inserting flip-flops along paths that contain logic gates cannot, by itself, guarantee the correctness of those paths. The circuit optimization technique that allows flip-flop insertion along such paths is called retiming [75]. Figure 40 illustrates flip-flop insertion in a microarchitecture system using retiming.

By inserting flip-flops at the primary outputs and shifting them toward the primary inputs of each module, the retiming technique can guarantee that the target clock period can be met as long

as the target clock period is greater than the summation of a gate delay, a flip-flop delay,

and the local wire delay connecting the gate and the flip-flop.

In this chapter, first the problem formulation is introduced in Section 6.1. Following

this, an extension of the state-of-the-art performance-driven placement algorithm GEO is proposed to either reduce the target clock period while using nearly the same number of flip-flops or to reduce the power consumption. The performance of GEO is improved by

using adaptive network characteristics in Section 6.2. Network (circuit) characterization is

a technique that classifies circuits based on their properties such as the number of gates,

the number of wires, and its structure. Then, simultaneous performance and power-driven

placement with retiming-based timing analysis has been studied in Section 6.3. Finally, this

chapter is concluded in Section 6.4.


Figure 40: Microarchitectural planning with retiming

6.1 Problem Formulation

Given a sequential gate-level netlist NL(C, N), where C is the set of cells representing gates and flip-flops and N is the set of nets connecting the cells, the purpose of the physical planning with retiming (PPR) problem is to assign the cells in NL to m × n (= K) given slots while preserving area constraints. Given a PPR solution C → B, let ω(B) and φ(B) respectively denote the wirelength (= half-perimeter of the net bounding box) and the retiming delay (to be defined later). The formal definition of the PPR problem is as follows:

The PPR (physical planning with retiming) problem has a solution P : C → B, where each cell in C is assigned to a unique block. B = {B1(x1, y1), B2(x2, y2), ..., BK(xK, yK)} denotes the set of blocks, where (xi, yi) represents the geometric location of Bi, with area constraints A(L, U) for 1 ≤ i ≤ K. The PPR solution has to satisfy the following conditions: 1) Bi ⊂ C and L ≤ |Bi| ≤ U; 2) B1 ∪ B2 ∪ ... ∪ BK = C; 3) Bi ∩ Bj = ∅ for i ≠ j. The primary objective of PPR is to minimize φ(B), and the secondary objective is to minimize ω(B).
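The three conditions can be checked mechanically; the sketch below, operating on sets of cell ids, is an illustration rather than the dissertation's implementation.

```python
# Check the three PPR conditions on a candidate assignment: every block
# is a subset of C with L <= |Bi| <= U (condition 1), the blocks are
# pairwise disjoint (condition 3), and together they cover C (condition 2).

def is_ppr_solution(cells, blocks, lower, upper):
    """cells: set of cell ids; blocks: list of sets B1..BK."""
    union = set()
    for b in blocks:
        if not (b <= cells and lower <= len(b) <= upper):
            return False          # condition 1 violated
        if union & b:
            return False          # condition 3: blocks must be disjoint
        union |= b
    return union == cells         # condition 2: blocks must cover C
```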

Delay objective

For the delay objective, netlist NL is modeled using a directed graph G = (V, E), where the vertex set V represents cells, and the directed edge set E represents the signal direction in

NL. In the geometric delay model, each vertex v has delay d(v) and each edge e = (u, v) has delay d(e). Let s(e) denote the cut-state of e : s(e) = 1 if e is cut, and s(e) = 0 otherwise.

In this chapter, the assumption d(e) = m(e) · s(e), where m(e) = |xu − xv| + |yu − yv|, is used. The delay of a path p, denoted d(p), is the sum of the delays of the gates and edges along p. Then, the normal delay δ(B) of global placement solution B is computed as δ(B) = max_{p∈G} { d(p(u, v)) | u ∈ PI or FF, and v ∈ PO or FF }.

By employing the concept of a retiming graph [75], netlist NL is modeled using a

directed graph R = (V, ER), where the edge weight w(e) of e = (u, v) denotes the number

of flip-flops between gates u and v. The path weight is calculated as w(p) = Σ_{e∈p} w(e). Let w^r(e) denote the edge weight after retiming r, i.e., the number of flip-flops on the edge after retiming. Then, the path weight after retiming is w^r(p) = Σ_{e∈p} w^r(e). A circuit is retimed to a delay φ by a retiming r if the following conditions are satisfied: (i) w^r(e) ≥ 0 for each e, and (ii) w^r(p) ≥ 1 for each path p such that d(p) > φ. The edge length of e = (u, v) is defined as l(e) = −φ · w(e) + d(v) + d(e), and the path length of p is l(p) = Σ_{e∈p} l(e). The sequential arrival time of vertex v, denoted l(v), is the maximum path length from the PIs or FFs to v. If the sequential arrival time of every PO or FF is less than or equal to φ, the target delay φ is called feasible. Let q(e) = φ · w(e) − d(u) − d(e) be the required edge length of e. The required path length is q(p) = Σ_{e∈p} q(e). The sequential required time of vertex v, denoted q(v), is the minimum required path length from v to the POs or FFs when q(PO) or q(FF) = φ. Then, the slack of v is given by q(v) − l(v). Let Dg = max{d(v) | v ∈ V }; the retiming delay φ(B) of a placement solution B is the minimum feasible φ plus Dg.
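The feasibility test for a candidate φ can be sketched as a Bellman-Ford-style relaxation of the sequential arrival times; the edge delay d(e) is taken as zero for brevity, and the graph encoding and iteration bound are illustrative assumptions, not the dissertation's implementation.

```python
# Feasibility test for a target delay phi: relax arrival times over edge
# lengths l(e) = -phi * w(e) + d(v) (edge delay d(e) assumed 0) until
# they stabilize; phi is feasible when every PO/FF arrival stays <= phi.

def feasible(phi, vertices, edges, d, w, sources, sinks):
    """edges: list of (u, v); d: gate delays; w: flip-flops per edge."""
    arrival = {v: d[v] if v in sources else float("-inf") for v in vertices}
    for _ in range(len(vertices)):
        changed = False
        for (u, v) in edges:
            cand = arrival[u] - phi * w[(u, v)] + d[v]
            if cand > arrival[v]:
                arrival[v] = cand
                changed = True
        if not changed:
            break
    else:
        return False  # arrival times still growing: phi is infeasible
    return all(arrival[v] <= phi for v in sinks)
```

A binary search over φ on top of this test then yields the minimum feasible delay underlying φ(B).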

Wirelength objective

Netlist NL is modeled using a hypergraph H = (V, EH ), where the vertex set V represents

cells, and the hyperedge set EH represents nets in NL. Each hyperedge is a non-empty

subset of V. The x-span of hyperedge h, denoted hx, is defined as max_{c∈h}{xc} − min_{c∈h}{xc}, where xc is the x-coordinate of the block to which cell c is assigned. The y-span, denoted hy, is calculated similarly using the y-coordinates. The sum of the x-span and y-span of hyperedge h is the half-perimeter of the bounding box (HPBB) of h, denoted HPBB(h). The wirelength ω(B) of floorplanning solution B is the sum of the HPBBs of all hyperedges in H.
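The HPBB wirelength can be sketched directly from the definition, taking cell positions as the coordinates of their assigned blocks.

```python
# Half-perimeter bounding-box (HPBB) wirelength: for each hyperedge,
# sum the x-span and y-span of its cells' block locations.

def hpbb(net, pos):
    """net: iterable of cell ids; pos: cell id -> (x, y)."""
    xs = [pos[c][0] for c in net]
    ys = [pos[c][1] for c in net]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))

def wirelength(nets, pos):
    """omega(B): sum of HPBBs over all hyperedges."""
    return sum(hpbb(n, pos) for n in nets)
```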

Power objective

For the power objective, netlist NL is modeled as hypergraph H = (V, EH ). The total

power consumption ρ(B) of global placement solution B is calculated as ρ(B) = Vdd² · f · Σ_{v∈V} (Cg(v) + Cw(v)) · SA(v), where Vdd is the supply voltage, f is the global clock frequency, Cg(v) and Cw(v) represent the gate capacitance and wire capacitance seen by gate v, and SA(v) is the switching activity of v. Cg(v) is the sum of the input capacitances of all sink gates driven by v. Let nv denote the net whose driving gate is v. In the case of floorplanning, Cw(v) = HPBB(nv) · Cg(v). Let VG be the set of visible gates, defined as VG = {v | s(nv) = 1}. Then, the visible power consumption π(B) of global placement solution B is calculated as π(B) = Vdd² · f · Σ_{v∈VG} (Cg(v) + Cw(v)) · SA(v). Note that the wire capacitance Cw(v) is the only factor that changes with the global placement solution. In other words, the power consumed by non-visible gates is fixed regardless of the global placement results. Thus, the objective is to minimize the visible power.
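A sketch of the visible-power computation follows; the constants (Vdd, f) and the per-gate capacitance and switching-activity tables are illustrative assumptions.

```python
# Visible power pi(B): only the gates driving cut nets contribute
# placement-dependent wire capacitance, so only their power terms are
# summed. Capacitances in farads, f in Hz; values are illustrative.

def visible_power(visible_gates, cg, cw, sa, vdd=1.0, f=10e9):
    """pi(B) = Vdd^2 * f * sum over visible gates of (Cg + Cw) * SA."""
    return vdd ** 2 * f * sum(
        (cg[v] + cw[v]) * sa[v] for v in visible_gates)
```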

6.2 Adaptive Network Characterization

In this section, adaptive network characterization is proposed to improve the timing with

little change in the number of flip-flops.

6.2.1 Observations

In this section, observations on retiming-based timing analysis (RTA) [75] are provided. In

addition, the information is utilized to dynamically guide the mincut-based global placement

for performance improvement. Benchmark circuits from ISCAS89 [118] and ITC99 [39]

suites are studied. Delay for all gates in the circuits is assumed to be one. Table 9 shows

the statistical information of benchmark circuits. The number of gates, PIs, POs and FFs

for each circuit is provided. Dr represents retiming delay. Here it is the lower bound of

retiming delay, which is calculated by assigning zero delay to all edges and then performing

RTA. Ob marks the circuits used in this observation study, where y indicates that the circuit is used. Throughout this section, the studies are based on an 8 × 8 global placement with five runs, α = 20, the T filter, and ε = the top 5% of nodes with small slack, unless explicitly specified (all to be explained later).

Correlation between Weighted Cutsize and Retiming

Most simultaneous placement or floorplanning with retiming [33, 35] algorithms employ

weighted cutsize during partitioning, when retiming-based timing analysis is performed for

net weight computation. For example, in GEO [33], Net Weight = Cutsize Weight + α × Delay Weight is used.

ckt      gate   PI  PO    FF    Dr  Ob
s641     379    35  42    19    74  y
s820     289    18  24    5     10  y
s1196    529    14  32    18    24  y
s1238    508    14  32    18    22  y
s1494    647    8   25    6     16  y
s5378    2828   36  49    163   32  y
s9234    5597   36  39    211   39  y
s13207   8027   31  121   669   50  y
s15850   9786   14  87    597   62  y
s35932   16353  35  2048  1728  27  y
s38417   22397  28  106   1636  32
s38584   19407  12  278   1452  47
b14o     5401   32  299   245   27
b15o     7092   37  519   449   38
b20o     11979  32  22    490   44
b21o     12156  32  22    490   43
b22o     17351  32  22    703   46

Table 9: Benchmark circuit characteristics

The α parameter determines how important delay weight is compared to

cutsize. The correlation between weighted cutsize and retiming delay is studied, and it is

discovered that the correlation factor is about 0.9 on average. This implies that weighted

cutsize is highly related to retiming delay. In Figure 41, the retiming delay versus weighted

cutsize for circuit s1238 is shown. When the first bipartition has a weighted cutsize of about

800, the variation on retiming delay is from 36 to 47. The cutsize of the next partition is

between 1,200 and 2,200, and the retiming delay varies from 62 to 75. This implies that

using weighted cutsize alone might not be enough to achieve a high reduction in retiming

delay even though the weighted cutsize is highly correlated with retiming delay.

Delay Weight ( α Parameter)

As shown in Figure 41, the highest impact partition based on the mincut-based approach

is the first partition. As more partitioning is performed, the retiming delay improvement

becomes less visible. Hence, assigning a higher α value and then reducing it as we perform

more partitioning can result in better performance. First, the best α is found, as shown

in Figure 42. The results are normalized to the case when α = 1. Among several fixed α

values from the experiments, α = 20 provides the best retiming delay on average. However, when the α value is gradually decreased from 20 to 0 as more partitioning is performed, the performance is the same as with a fixed α = 20. Moreover, the run is faster, since retiming-based timing analysis can be turned off once α = 0.

Figure 41: Correlations between retiming delay and weighted cutsize on circuit s1238

Number of Runs

During each run of the iterative improvement-based partitioning, a random solution is used as the initial solution. After all the runs are completed, the best run for the current partitioning is selected. The observation is that there is more room for retiming delay improvement during early partitioning. Therefore, the number of runs should change dynamically: more runs are performed during the early partitioning steps, and the count gradually decreases as more partitioning is performed. For example, for an 8 × 8 global placement, instead of

using a fixed five runs throughout the program, 20 runs are used in the first three levels (the first seven partitionings) and three runs for the rest. Hence, the total is 7 × 20 + 56 × 3 = 308 runs, which is smaller than 63 × 5 = 315. The obvious advantage is the runtime savings.

Figure 42: Impact of different α values

Note from Figure 43 that the adaptive scheme generates slightly but consistently better retiming delay results. This is because more runs for the early cuts can be afforded in the adaptive scheme than in the conventional scheme, in which the number of runs is fixed during the entire top-down partitioning process.

Net Filtering and Cell Selection (ε Parameter)

During circuit partitioning, a hypergraph model is employed to represent the netlist. In

[33], the net weight is assigned based on criticality among cells/clusters. The equation used

in [33] to assign net weight is as follows:

dwgt(n) = α · [1 − min{slack(v) | v ∈ n} / max{slack(w) | w ∈ NL0}]^p1    (52)

Figure 43: Impact of the adaptive number of runs

Figure 44: Example of net filtering

Note in the above equation that non-critical cells might be included in the weight computation, as illustrated in Figure 44. Suppose cells a through e are on the critical path. By selecting nets that contain critical cells, other cells that are not timing critical, such as f, g, h, i, j, and k, are also included. To remedy this problem, [35] employs the PATH [69] weight function. However, the PATH algorithm is expensive since it requires evaluating an exponential function. Here, two net filtering methods are proposed.

The first one is T filter. In the T filter, a net gets a new delay weight when there are at least two critical cells/clusters in it. This scheme now solves the problem illustrated in

Figure 44. The second filter is A filter. In the A filter, a net gets a new delay weight when

all cells/clusters in the net are critical.

Figure 45: Impact of different net filtering methods

Figure 45 shows the impact of the different filtering

methods, where O represents the original GEO, T represents the T filter method, and A represents the A filter method. The results are normalized relative to GEO with no cut threshold (i.e., ε = 100%), whereas the rest use a cut threshold of ε = 5% of cells/clusters (i.e., the top 5% by minimum slack value are treated as critical cells/clusters). Results show that on average, the T filter method is better than GEO and the A filter by about 11% and 9%, respectively.

However, for big circuits such as s9234, s13207, and s15850, the A filter yields better results.

This is because it is harder to group a large set of critical cells/clusters into the same partition when the number of cells/clusters is large. The A filter can then be used to consider only "highly" critical nets and hence reduce the number of cells/clusters. In this adaptive scheme, the A filter is used when the number of cells in the current partitioned circuit is higher than 5,000 (based on the results in Figure 45).

Next, a way to dynamically update ε based on circuit characteristics is studied. The ε parameter is used to decide how many cells are critical, for example the top 5% of

Figure 46: Slack distribution classifications (a), (b), and (c), where the x-axis represents slack value and the y-axis represents cumulative frequency percentage normalized to 100%

cells having minimum slack value. In [33, 1, 34, 35], it is assigned as a fixed value. In this

adaptive scheme, however, first the cumulative slack values are classified into three groups:

slow start, medium start, and fast start based on the number of cells/clusters falling into

the top minimum slack value, as shown in Figure 46. Here the slack distribution under the initial clock cycle is plotted so that the minimum slack will not be zero. The boundary

lines used here are 15% for medium and 35% for fast start. From the observation, most

circuits’ initial partitions fall into the slow start category. Another observation is that for

slow start distribution, choosing the best value is difficult, especially when randomness is

involved.

The result in Figure 47 shows that the higher the number of runs the lower the variance

on different ε values. In Figure 47, 5r and 20r represent 5 and 20 runs, respectively, and the percentage represents the ε value. The graph is normalized relative to GEO with ε = 100%.

The variance reduces from 0.0021 to 0.0008 when the number of runs increases from 5

to 20 runs. One observation is that if the slack distribution of the current partition falls

into the medium start category, selecting the first critical slack value is sufficient since it

already contains a substantial number of cells/clusters for consideration. For the fast start

distributions, assigning ε to be zero (i.e., considering only wirelength minimization) is adequate

since there are too many critical cells/clusters, and it is hard to group these cells/clusters

into the same partition without violating area constraints. Figure 48 shows the impact on

the medium and the fast start case on bipartitioning with 5 runs on 2 × 1 slots normalized

to a global placement with α = 0. The slack distribution of s641 is shown in Figure 46 (b),

and that of s35932 is shown in Figure 46 (c). For the medium start, when ε is higher than

the starting threshold, the retiming delay drops, as can be seen when ε = 17%. On the other hand, for the fast start case, no matter what the value of ε is, the retiming delay is higher than that of partitioning targeting cutsize only.

Figure 47: Impact of different ε values

Impact of Clustering

Circuit clustering is important for cutsize reduction, especially when considering a large

number of cells. This also holds for weighted cutsize. As shown earlier in Section 3.1,

weighted cutsize has a positive correlation with retiming delay. Hence, for a large circuit

without clustering, the delay results are usually worse compared to the clustering case.

However, the accuracy of retiming-based timing analysis (RTA) [33] may decline during

clustering. This is because the criticality of clusters needs to be computed based on the

criticality of individual gates in the cluster. In fact, it is difficult to assign a single number to

represent the timing criticality of all nodes in a cluster. In addition, clustering requires more

runtime and more memory space. Therefore, a trade-off in performing clustering in terms

of solution quality versus runtime has to be considered. Figure 49 shows this observation,

i.e., when the number of cells is large, clustering starts to outperform the non-clustering approach.

Figure 48: Examples of medium and fast start (s641 and s35932)

Here, a way to adaptively decide when to perform clustering based on the number of cells is proposed. More than 7,000 cells (based on the results from the graph) is used as the threshold for deciding whether to perform clustering on the current partition. If the number of cells is higher than the threshold, clustering to reduce weighted cutsize is employed.

6.2.2 Adaptive Network Characterization Methodology

The GEO [33] algorithm is modified and called A-GEO, the adaptive GEO. An overview of the A-GEO algorithm is shown in Figure 50; any part that is updated from GEO is underlined. Note that the adaptive scheme can be used not only for GEO but also for any timing-driven mincut-based global placement. A-GEO produces a global placement solution for the PPR problem. Following the mincut-based global placement approach, the netlist NL is recursively bipartitioned until m × n tiles are generated. After all bipartitionings for the current level are finished, multiway refinement [32] is performed on the entire netlist. Initially,

the partitioning tree T has only the root node R. Then, all cells in NL are inserted into R.

Figure 49: Impact of clustering

The FIFO queue Q is used to support the recursive breadth-first cut sequence.

A-GEO-2way first generates a sub-netlist from the given partition tree node and performs multi-level clustering on it. The ESC clustering algorithm [34] is used for this purpose. Then, a random initial partitioning B among the clusters at the top level of the hierarchy is obtained. The subsequent top-down multi-level refinement is used to improve B in terms of delay. For timing-driven global placement, RTA [75] is performed to identify timing-critical cells/clusters. Then, the delay weights for the nets in the sub-netlist are computed for delay optimization. The subsequent iterative improvement through cluster moves tries to minimize the weighted cutsize. Finally, the current solution is projected to the next coarser netlist level for multi-level optimization. At the end of A-GEO-2way, two new child nodes are inserted into T based on B.

The modifications made to GEO are summarized as follows. First, it is observed that selecting the solution with the best weighted cutsize does not guarantee the best retiming delay among all runs. Instead of returning the B with the best weighted cutsize among all runs, the best solution B based on real retiming results is chosen, as shown in line 26. Second, adaptive α is considered in line 20: the algorithm begins with α = 20 and gradually decreases it. Third, the number of runs is adapted in line 16 by starting with a high number of runs during the earlier partitions and decreasing it gradually, as described in Subsection 6.2.1. Here, 20 runs are used for the first three levels and three runs for the rest. Fourth, adaptive filtering based on the number of cells in the current partition is employed, as in lines 35, 36, 40, and 42. Fifth, the circuit characteristic is considered while selecting the ε value: if the circuit is in slow start mode, ε = 5% is used; if it is in medium start mode, the first value in the medium range is selected as ε; and if it falls in fast start mode, ε is set to zero and only cutsize is considered instead, as shown in line 34. Finally, whether to perform clustering is decided based on the number of cells in line 11. Since the best retiming delay is projected to the next level, the retiming delay must be calculated after each run. However, each RTA requires O(n log n) time. To be more precise, the algorithm requires run · K · n log n time, where run is the number of runs, K is the number of partitions, and n is the number of cells.

6.2.3 A-GEO Experimental Results

The algorithms are implemented in C++/STL, compiled with gcc v2.96 with -O3, and run on a Pentium III 750 MHz machine. The benchmark set consists of twelve circuits from the ISCAS89 [118] suite and five circuits from the ITC99 [39] suite. The results are reported in Table 10 on 8 × 8 tiles. GEO represents the state-of-the-art timing-driven mincut-based global placement proposed in [33] with five runs. A-GEO represents the modified GEO algorithm with the adaptive methods, using about 4.88 runs on average (due to the adaptive number of runs). The GEO+200r result, i.e., GEO with 200 runs, is also reported for fairness since A-GEO has a higher running time than the original GEO. The average ratios and running times, measured in seconds, are also reported. Results from Table 10 show that A-GEO is better than GEO by about 21.9% and better than GEO+200r by 13.1%. Note that GEO+200r requires about four times more running time than A-GEO. Hence, increasing the number of runs

Table 10: Performance improvement results
Ckt       GEO           GEO+200r      A-GEO
Name      Dr    wire    Dr    wire    Dr    wire
s641      143   409     137   383     97    442
s820      47    599     42    631     28    582
s1196     74    1,032   74    1,047   49    1,025
s1238     77    1,128   70    1,019   50    1,095
s1494     55    997     51    1,024   41    1,055
s5378     57    1,453   51    1,371   45    1,907
s9234     50    1,459   48    1,408   48    2,132
s13207    86    1,689   72    1,491   69    2,091
s15850    90    1,824   88    1,708   83    2,025
s35932    45    2,113   47    1,903   41    2,536
s38417    41    2,394   39    2,054   41    2,610
s38584    81    3,184   75    2,371   59    4,450
b14o      67    3,658   64    3,704   64    4,114
b15o      79    5,786   72    5,306   79    5,773
b20o      74    6,087   73    5,990   64    7,158
b21o      79    6,149   67    5,775   70    6,941
b22o      80    7,620   63    7,229   63    8,774
Avg.      1     1       0.93  0.91    0.82  1.14
Time      1,517         51,233        14,232

alone is not as good as using the adaptive method.

6.3 GEO-PD Algorithm

Overview of GEO-PD Algorithm

An overview of the GEO-PD algorithm is shown in Figure 51. GEO-PD is a multi-

level global placement for simultaneous delay and power optimization. GEO-PD places

the given netlist NL into K = n × m tiles using a top-down recursive bipartitioning

approach. It consists of two subroutines: GEO-PD-2way recursively bipartitions NL, and

GEO-PD-Kway refines these partitioning results occasionally, as illustrated in Figure 52.

GEO-PD-2way is performed on the sub-netlist, whereas GEO-PD-Kway is performed on the

entire netlist. Initially, the partitioning tree T has only root node R, and all cells in NL are

inserted into R. The First-In-First-Out (FIFO) queue Q is used to support the recursive

breadth-first cut sequence. GEO-PD-2way first generates the sub-netlist from the given

partition tree node and performs multi-level clustering on it. The edge separated clustering algorithm [34] is used for this purpose. An illustration of the multi-level cluster hierarchy is shown in Figure 52. Then, a random initial partitioning B among the clusters at the

top level of the hierarchy is obtained. The subsequent top-down multi-level refinement is

used to improve B in terms of delay and power. Retiming-based timing analysis (RTA) [33]

is performed to identify timing-critical nets. Power analysis is also performed to identify

power-critical nets. Then, the delay and power weights for the nets in the sub-netlist

for simultaneous delay and power optimization are computed. The subsequent iterative

improvement through cluster move tries to minimize the weighted cutsize. Finally, the

current solution is projected to the next coarser netlist level for multi-level optimization.

At the end of GEO-PD-2way, two new children nodes are inserted into T based on B.

GEO-PD-Kway refinement is performed when 2^j partitions (j > 1) are obtained from GEO-PD-2way (4, 8, 16 partitions, etc.). First, a restricted multi-level clustering is performed, in which grouping among cells in different partitions is prohibited. This allows the partitioner to preserve the initial partitioning results. Then, multi-level partitioning is invoked again in the same way as in GEO-PD-2way for additional delay and power improvement.

GEO-PD-Kway is applied onto the global netlist for more global level optimization.

Weight computation

For simultaneous delay and power optimization, first timing and power critical nets are

identified, and proper weights are assigned to guide the optimization process. A net is timing

critical if it lies along a critical path and power critical if it has high fanout with large

wirelength and is driven by a gate with high switching activity. In GEO-PD, retiming delay

and visible power are minimized through retiming-based timing analysis [75] and visible

power analysis. Sequential slack is used to compute how much time slack exists before

timing violation occurs after retiming. These values are then used to compute the delay

weights of the nets for retiming delay minimization. In the case of power optimization,

switching activity and gate/wire capacitance are used to compute power weights of the

nets for visible power minimization. Both delay and power weights are combined together,

and GEO-PD performs multi-level partitioning to minimize the total weighted cutsize (for

partitioning) or weighted wirelength (for floorplanning). Note that the multi-level approach [34, 64] is very effective in minimizing the weighted cutsize and wirelength. However, timing and power analysis is typically done on the original netlist, while a recursive multi-level approach performs partitioning and placement on the sub-netlist as well as its coarsened

representations. Thus, it is crucial to have an effective way to translate the timing and

power analysis results from the original netlist to a coarsened sub-netlist.

Delay weight computation Figure 53 shows DELAY-WEIGHT(NL), the delay weight

calculator. Before retiming-based timing analysis (RTA) is performed, the edge delay in

R (= retiming graph) is initialized based on the current placement results. The delay of

edges is set according to their Manhattan distances. Then, a Bellman-Ford variant RTA is

performed from a given feasible delay to compute sequential slack. For each cluster C from

the given coarsened sub-netlist NL, C(R), the set of all the nodes in R that are grouped

into C is computed. The minimum slack among all cells in C(R) is used as the slack for C.

The minimum slack value is used because the critical path information is preserved regardless of the multi-level clustering results (experiments using the average slack value instead of the minimum were also performed, but the minimum slack method generated better delay results). After the cluster slack computation is finished, the clusters are sorted

in a non-decreasing order of their slack values. The top x% is stored (3% is used in this

experiment) into a set X. For each net that contains only the clusters in X, the following

equation is used to compute the delay weight:

dwgt(n) = α · [1 − min{slack(v) | v ∈ n} / max{slack(w) | w ∈ NL0}]^p1    (53)

This equation gives higher weights to the nets that contain smaller minimum cluster slack, thus giving higher priority to the nets containing more timing-critical clusters. The related experiment indicates that this approach produced worse results. The extensive experiments indicate that α = 25 and p1 = 1 are an excellent empirical choice.

Power Weight Computation Figure 53 shows POWER-WEIGHT(NL), the power weight calculator. As discussed earlier, the goal is to minimize visible power consumption since the power consumed by non-visible gates is fixed regardless of partitioning or

Table 11: Benchmark circuit characteristics
ckt      gate    PI  PO   FF    Dr  Dn
b17o     22854   37  97   1414  38  44
b20o     11979   32  22   490   73  74
b21o     12156   32  22   490   73  74
b22o     17351   32  22   703   78  79
s5378    2828    36  49   163   32  33
s9234    5597    36  39   211   39  58
s13207   8027    31  121  669   50  59
s15850   9786    14  87   597   62  82
s38417   22397   28  106  1636  32  47
s38584   19407   12  278  1452  47  5

floorplanning results. Since it is not known a priori which nets will be cut after the partitioning, the power weights are computed assuming all nets are cut. Then, the goal is to

minimize the weighted cutsize or wirelength. For a net driven by a gate v, the following

equation is used to assign power weight:

pwgt(n) = β · [SA(v) · (Cg(v) + Cw(v)) / max{SA(u) · (Cg(u) + Cw(u)) | u ∈ V}]^p2    (54)

where SA(v), Cg(v), and Cw(v) respectively represent the switching activity, gate capacitance, and wire capacitance seen by gate v. Cw(v) = HPBB(nv) · Cg(v) is used for floorplanning. This equation gives higher weights to the nets that have high fanout, larger wirelength,

and a source gate with high switching activity. In a multi-level approach, each net in the original netlist NL is transformed depending on the given sub-netlist and its multi-level clustering information. For example, na = {a, b, c, d} in NL becomes nC1 = {C1, C2} if the sub-netlist contains only a and b, with a clustered into C1 and b into C2. In this case, HPBB(na) is computed based on the locations of C1, C2, c, and d, and SA(a) is used in the power weight

equation. The extensive experiments indicate that β = 25 and p2 = 0.3 are an excellent

empirical choice.

6.3.1 GEO-PD Experimental Results

The algorithms are implemented in C++/STL, compiled with gcc v2.96, and run on a Pentium III 750 MHz machine. The benchmark set consists of seven big circuits from the ISCAS89 suite and three big circuits from the ITC99 suite. Random switching activities are generated for these circuits since this information is not available. Table 11 gives statistical information on the benchmark set. PI, PO, and FF are the numbers of primary inputs, primary outputs, and flip-flops, respectively. Dr and Dn represent the optimum retiming delay and the normal delay of the given circuits, calculated by assigning edge delays to zero and using the

longest path. The experiments are conducted using ESC [34], GEO [33], GEO-P, and GEO-

PD algorithms. GEO-P is the modified GEO algorithm targeting the power objective, and

GEO-PD is the modified GEO algorithm targeting both delay and power objectives. Ex-

periments can be classified into two groups: partitioning and floorplanning. Wirelength,

retiming delay, normal delay, visible power, and total power are reported. The experiments

are performed on an 8×8 dimension global placement (Table 11). Average ratio, which is the

average performance improvement ratio using the ESC algorithm as baseline, is reported.

Average running time of each algorithm measured in seconds is also reported. Results from

the GEO algorithm on circuit floorplanning in Table 12 show that an average 10% improvement is gained in terms of retiming delay. For power optimization, GEO-P provides a 24%

average improvement in terms of visible power on floorplanning. However, minimizing only

the delay comes at the expense of increased power dissipation and vice versa. By minimizing

both power and delay simultaneously, both of them are reduced. About a 3% improvement in retiming delay and a 12% improvement in visible power are reported. The reason

that the floorplanning achieves a better result for delay optimization than partitioning is

because location information is available and hence allows the floorplanner to have wider

solution space to explore. Results also illustrate that by employing more accurate cell loca-

tion, retiming delay can further be reduced. This is also true for power optimization, even

though it is not as obvious as delay optimization.

6.4 Summary

Retiming is a technique that can help microarchitecture-aware physical planning achieve the target clock period by shifting flip-flops into microarchitecture modules. Hence, it enables a deeper pipeline system. In this chapter, the retiming technique is studied. The

Table 12: Comparison among ESC, GEO, GEO-P, and GEO-PD on 8×8 floorplanning. Each algorithm reports wirelength, retiming delay (Dr), normal delay (Dn), visible power (Pv), and total power (Pt).

Ckt      ESC                        GEO                        GEO-P                      GEO-PD
name     wire Dr  Dn  Pv   Pt      wire Dr  Dn  Pv   Pt      wire Dr  Dn  Pv   Pt      wire Dr  Dn  Pv   Pt
b20o     5772 72  107 3335 4125    6730 79  114 3660 4453    6450 71  107 3101 3925    7277 72  110 3145 3971
b21o     6357 79  127 3458 4282    6618 65  109 3468 4266    6703 75  117 2863 3693    7491 70  113 3235 4068
b22o     7243 77  118 4076 5279    7724 69  103 4473 5676    8570 83  137 3879 5106    8685 76  124 4211 5440
s5378    1502 60  77  384  535     1462 45  71  389  539     1539 57  65  234  407     1597 57  69  269  438
s9234    1425 50  91  427  744     1685 48  101 476  787     1510 52  101 292  623     1683 48  95  296  629
s13207   1525 91  106 747  1231    1925 77  96  900  1378    1803 91  106 536  1032    2367 91  102 634  1125
s15850   1587 99  143 584  1172    2085 90  129 814  1392    1720 96  136 395  1009    2236 100 140 517  1126
s35932   1569 33  50  2807 4025    1878 35  58  2798 4000    2352 47  66  2811 4043    2148 37  58  2982 4204
s38417   2032 41  71  1158 2578    2695 41  82  1483 2895    2524 43  81  963  2423    2819 41  67  1088 2535
s38584   2973 87  102 1950 3326    3663 68  80  2091 3432    3061 79  91  1619 3043    3546 69  84  1766 3140
Ratio    1    1   1   1    1       1.18 0.91 0.97 1.13 1.07  1.15 1.04 1.03 0.81 0.89  1.28 0.97 0.98 0.89 0.95
Time     104                       2231                       121                       2257

proposed algorithm allows smaller target clock period to be achieved while maintaining the

same number of flip-flops. In addition, with the same target clock period, a technique to

use available slack to further reduce the power consumption of the module is also proposed.

A-GEO(NL, K, run)
1.  insert all cells in NL to root node R in T (= part tree)
2.  insert R into Q (= FIFO queue)
3.  while (leaf nodes in T < K)
4.    N = remove front element in Q
5.    GEO-2way(N, run) (= bipartitioning on N)
6.    split cells in N into N1 and N2
7.    insert N1 and N2 into Q and T
8.  refine
9.  return T

A-GEO-2way(N, run)
10. NL = sub-netlist containing cells in N
11. if #cells > threshold T1
12.   ESC(NL) (= multi-level clustering on NL)
13. h = height of the cluster hierarchy
14. B = random partitioning among clusters at level h
15. for (i = h downto 0)
16.   compute adaptive #run
17.   NL(i) = coarsened NL at level i
18.   for (j = 1 to run)
19.     while (gain)
20.       if not (bottommost level & node id > K/2)
21.         DELAY-WEIGHT(NL(i))
22.       total net weight = 1 + alpha x delay weight
23.       while (gain)
24.         move cells in NL(i) to min. weighted cutsize
25.       retrieve max gain moves and update B
26.   project best retiming B to level i-1
27. return B

DELAY-WEIGHT(NL)
28. set delay of edges in R (= retiming G)
29. perform RTA(R) (= timing analysis)
30. compute sequential slack for nodes in R
31. for each cluster C in NL
32.   C(R) = all cells in R grouped into C
33.   slack(C) = min among cells in C(R)
34. X = ceil(top x% small slack) if |X| < 25% or |X| = 25%
35. if #cells < threshold T2 use T filter
36. else use A filter
37. for each net N in NL
38.   if original GEO filter
39.     compute delay-weight(N) using Eqn1
40.   elif (T filter and at least two cells in N are in X)
41.     compute delay-weight(N) using Eqn1
42.   elif (A filter and all clusters in N are in X)
43.     compute delay-weight(N) using Eqn1

Figure 50: Overview of the A-GEO algorithm

insert all cells in NL to root node R in T (= partitioning tree)
insert R into Q (= FIFO queue)
while (leaf nodes in T < K)
  N = remove front element in Q
  GEO-PD-2way(N) (= bipartitioning on N)
  split cells in N into N1 and N2
  insert N1 and N2 into Q and T
  if (there are 2^j leaf nodes in T for j>1)
    GEO-PD-Kway(T)

GEO-PD-2way(N)
  NL = sub-netlist containing cells in N
  ESC(NL) (= multi-level clustering on NL)
  h = height of the cluster hierarchy
  B = random partitioning among clusters at level h
  for (i = h downto 0)
    NL(i) = coarsened NL at level i
    while (gain)
      DELAY-WEIGHT(NL(i))
      POWER-WEIGHT(NL(i))
      total net weight = 1 + power weight + delay weight
      while (gain)
        move cells in NL(i) to minimize weighted cutsize
      retrieve max gain moves and update B
    project B to level i-1
  return B

GEO-PD-Kway(T)
  B = derive initial partitioning for NL from leaf nodes in T
  ESC(NL) (= restricted clustering preserving K-way cutlines)
  perform multi-level partitioning to minimize weighted wirelength
  update T

Figure 51: Overview of the GEO-PD algorithm

Figure 52: Illustration of the partitioning tree and breadth-first cut sequence in the GEO-PD algorithm. v and h denote vertical and horizontal cuts. A K-way refinement is performed when there are 2^j blocks (j > 1).

DELAY-WEIGHT(NL)
  set delay of edges in R (= retiming G)
  perform RTA(R) (= timing analysis)
  compute sequential slack for nodes in R
  for each cluster C in NL
    C(R) = all cells in R grouped into C
    slack(C) = min among cells in C(R)
  X = top x% clusters with small slack
  for each net N in NL
    if (all clusters in N are in X)
      compute delay-weight(N) using Equation [1]

POWER-WEIGHT(NL)
  for each net Nv in NL
    Nv = corresponding net in NL
    compute HPBB(Nv)
    compute power-weight(Nv) using Equation [2]

Figure 53: Overview of the delay and power weighting functions in the GEO-PD algorithm

CHAPTER VII

MICROARCHITECTURE-AWARE RETIMING AND

VOLTAGE ASSIGNMENT

Low power design has become an important objective in the design of high-performance circuits. The low power research community has actively proposed a huge volume of solutions during the last 20 years. Among the most successful ones at the circuit level are supply voltage (Vdd) scaling, threshold voltage (Vth) scaling, gate-oxide (Tox) scaling, gate sizing, retiming, and any combination of these methods. A majority of these studies can be categorized into (i) Vdd scaling [119, 25, 21, 24], (ii) Vth scaling [127], (iii) simultaneous Vdd and Vth scaling [97, 88, 38, 107], (iv) Vth scaling and sizing [105, 63, 94, 87], (v) simultaneous

Vdd, Vth scaling and gate sizing [53, 108, 14, 7, 59], (vi) simultaneous Vth, Tox scaling and state assignment [111], (vii) retiming [72, 81], and (viii) Vdd scaling and retiming [100, 18, 19].

In addition, various level converter designs and their usage are studied to support the low-Vdd to high-Vdd conversion in the Vdd scaling method [132, 113, 70, 61, 121].

The contribution of this chapter is the first integration of retiming and simultaneous supply and threshold voltage scaling for total power reduction. Retiming [75] improves not only the clock period but also the dynamic power when FFs are moved to high switching

interconnects [81]. In addition, it is important for the microarchitecture-aware physical planning framework to achieve the target clock period to be able to improve the perfor- mance. Moreover, FFs can be used to enable low-to-high supply voltage transition, thereby

reducing the need for separate level converters. However, the integration of retiming and voltage scaling by performing both techniques simultaneously is a complex task resulting

from its enormous solution space. We employ a three-step approach composed of retiming,

voltage scaling, and post refinement steps. First, low power retiming is performed to allow

better voltage scaling while taking the FF delay/power into consideration. Contrary to the

claim made in [18, 19], it is shown here that min-FF retiming is more effective than max-FF retiming for total power reduction. Next, the subsequent voltage scaling produces the best possible supply/threshold voltage assignment while satisfying the timing constraints set by

the prior retiming step. The voltage scaling is formulated in ILP. Then, the ILP formulation

is relaxed to LP. After LP is solved, a heuristic is applied to convert the continuous LP solutions to integer solutions. Finally, a post refinement step further refines the voltage scaling solution by exploiting the remaining timing slack in the circuit.

Related experiments show that the LP-based method, referred to as the retiming-based voltage scaling (RVS) algorithm, obtains results that are very close to those of the original ILP formulation. In addition, RVS outperforms several known techniques including the CVS [119] and VVS [108] algorithms. The remainder of this chapter is organized as follows. Section 7.1

presents the algorithm. The experimental results are shown in Section 7.2. This chapter is

concluded in Section 7.3.

7.1 Methodology

A three-step approach consisting of retiming, voltage scaling, and post refinement is pro-

posed. The goal of the low power retiming is to enable more rigorous voltage scaling during

the subsequent steps while reducing the power consumed by the FFs. In the voltage scal-

ing step, first the voltage scaling problem is formulated using integer linear programming

(ILP). The ILP is relaxed to linear programming (LP) and several techniques are applied to

convert the continuous variables resulting from LP to integers. In the final step, we perform

a timing slack-based post refinement to perform more voltage scaling on the ILP solution.

7.1.1 Low Power Retiming

A synchronous sequential circuit is modeled as a directed graph G = (V, E, d, w), where V is

the set of gates, and E is the set of directed edges connecting the gates. Edge ei,j rep-

resents a connection from gate i to gate j. d(i) is the delay of gate i and w(i, j) is the

number of FF on edge ei,j. Let P (i, j) denote a directed path from gate i to gate j,

and w(P(i, j)) = Σe∈P w(e) denote the total weight of the edges along P(i, j). Let d(P(i, j)) denote the total delay of the nodes along P(i, j). The original retiming paper

[75] introduces the following two matrices: (i) W (u, v) denotes min{w(P (u, v))|∀u, v ∈ V },

which is the minimum weight value among all paths that connect u and v, and (ii) D(u, v)

denotes max{d(P (u, v))|w(P (u, v)) = W (u, v), ∀u, v ∈ V }, which is the maximum delay value among all paths with total weight of W (u, v).
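For illustration, the W and D matrices can be computed with a Floyd-Warshall pass over lexicographically ordered (weight, -delay) pairs, following the construction in the original retiming work [75]. The sketch below is illustrative only; the graph encoding (dicts keyed by node and edge) is an assumption of this example, not the thesis implementation.

```python
from itertools import product

def wd_matrices(nodes, edges, d):
    """Compute the W(u, v) and D(u, v) matrices of the retiming formulation.

    nodes: list of gate ids; d: dict gate -> gate delay;
    edges: dict (u, v) -> FF count w(u, v).
    Shortest paths over lexicographic pairs (w(e), -d(u)) give
    W(u, v) as the first component and D(u, v) = d(v) - second component,
    the maximum total delay among minimum-weight paths.
    """
    INF = float("inf")
    dist = {(u, v): (INF, INF) for u, v in product(nodes, nodes)}
    for u in nodes:
        dist[(u, u)] = (0, 0)
    for (u, v), w in edges.items():
        dist[(u, v)] = min(dist[(u, v)], (w, -d[u]))
    for k in nodes:                      # Floyd-Warshall over pairs
        for u in nodes:
            for v in nodes:
                a, b = dist[(u, k)], dist[(k, v)]
                if a[0] < INF and b[0] < INF:
                    dist[(u, v)] = min(dist[(u, v)], (a[0] + b[0], a[1] + b[1]))
    W, D = {}, {}
    for (u, v), (x, y) in dist.items():
        if x < INF:
            W[(u, v)], D[(u, v)] = x, d[v] - y
    return W, D
```

On the small chain a -> b -> c with a shortcut edge a -> c, the minimum-weight path determines W, and the delay of that path (including both endpoints) determines D.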

Let T be a target clock period.1 Let r(u) represent the number of FFs moved from all fan-out edges of node u to all fan-in edges of u. Retiming assigns an integer r(u) to each node u ∈ V such that the following constraints are met: (i) r(u)−r(v) ≤ w(eu,v), ∀eu,v ∈ E,

(ii) r(u) − r(v) ≤ W(u, v) − 1, ∀u, v ∈ V such that D(u, v) > T, (iii) the clock period after

the retiming is equal to or less than T. Let FI(v) and FO(v) be the fan-in and fan-out counts of

node v. The ILP-based low power retiming is formulated as follows:

Minimize Σv∈V {FI(v) − FO(v)} · r(v)    (55)

Subject to:

r(u) − r(v) ≤ w(eu,v), ∀eu,v ∈ E (56)

r(u) − r(v) ≤ W (u, v) − 1, ∀D(u, v) > T, ∀u, v ∈ V (57)

The objective of the mathematical formulation is to minimize the total number of FFs

under the clock period constraint. This is done by minimizing the total edge weight of the

graph after retiming. If FI(v) < FO(v) and r(v) > 0, then the total number of FFs is

reduced. Thus, the objective function tries to assign a valid r(v) > 0 for a node v with

FI(v) < FO(v). On the other hand, if FI(v) > FO(v), then the total number of FFs is

reduced if r(v) < 0. Then the objective function tries to assign a valid r(v) < 0 for a node v with FI(v) > FO(v). Constraint (56) states that the number of FFs on each edge after retiming cannot be negative. Constraint (57) states that there exists at least one FF on any path with delay more than T. Based on the unimodular and Eulerian properties of the retiming problem, the ILP-based retiming here can be solved in polynomial time, as shown

in [75].
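As an illustration, a candidate retiming r can be checked against constraints (56) and (57) directly. The sketch below assumes the W and D matrices have already been computed; the data structures are hypothetical and not the thesis implementation.

```python
def retiming_is_valid(r, w, W, D, T):
    """Check a candidate retiming r against constraints (56) and (57).

    r: dict node -> retiming label; w: dict edge (u, v) -> FF count;
    W, D: retiming matrices over connected pairs; T: target clock period.
    """
    # (56): retimed edge weights w(e) + r(v) - r(u) must stay non-negative,
    # i.e., r(u) - r(v) <= w(e_{u,v}).
    for (u, v), wt in w.items():
        if r[u] - r[v] > wt:
            return False
    # (57): every path with delay > T must keep at least one FF,
    # i.e., r(u) - r(v) <= W(u, v) - 1 whenever D(u, v) > T.
    for (u, v), dly in D.items():
        if dly > T and r[u] - r[v] > W[(u, v)] - 1:
            return False
    return True
```

The zero retiming is always feasible for a period T no smaller than the current critical path; shifting labels can violate either the edge-weight bound (56) or the critical-path bound (57).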

In [19], the authors state that the total number of edges that contain FFs is maximized

in their retiming formulation for dynamic power reduction. The motivation is to increase

1We perform binary search to find the minimum clock period. The retiming algorithm is used to check the feasibility of a given clock period in this case.

the number of nodes off timing critical paths, which are ideal candidates for supply voltage

scaling. This approach, however, tends to increase the FF count as well as the power

consumed by the FFs. The experimental results shown in Section 7.2 indicate that the

min-area retiming produces better total power reduction if the power consumed by FFs is

considered.
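The binary search mentioned in the footnote (finding the minimum clock period, with the retiming algorithm as a feasibility oracle) can be sketched as follows. The candidate list and the feasibility callback interface are assumptions of this example.

```python
def min_clock_period(candidates, feasible):
    """Binary search for the smallest feasible clock period.

    candidates: sorted list of candidate periods (e.g., the distinct
    D(u, v) values); feasible: predicate T -> bool implemented by a
    retiming feasibility check. Returns the minimum feasible candidate,
    or None if no candidate is feasible.
    """
    lo, hi, best = 0, len(candidates) - 1, None
    while lo <= hi:
        mid = (lo + hi) // 2
        if feasible(candidates[mid]):
            best = candidates[mid]
            hi = mid - 1          # feasible: try to shrink the period
        else:
            lo = mid + 1          # infeasible: a larger period is needed
    return best
```

Because feasibility is monotone in T (a larger period can only help), the search needs O(log n) feasibility checks.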

7.1.2 ILP-based Voltage Scaling

The second step of the approach is to perform dual supply and threshold voltage assignment

so that the total power (= dynamic+leakage) consumed by the gates and level converters

(LC) is minimized.2 One issue with Vdd scaling is that the low-to-high Vdd conversion needs

a special method to guarantee the reliable computation. There are two ways to support

this conversion [120, 132, 113, 70, 61, 121]. The first is to use a FF that can handle the

conversion as well, which is named the level conversion FF (LCFF). The second is to use a

separate level converter (LC), which can be inserted anywhere in the circuit to raise the low-

Vdd input voltage back to the high-Vdd level. In this chapter, both LCFFs and LCs are used.

LCs are used only on zero-weight edges (= edges with no FFs). In [100], it is shown that the

energy consumption of a high-Vdd FF can be up to 13 times, and that of an LCFF up to 8

times, that of a high-Vdd two-input NAND gate. In this chapter, we do not consider the

delay of the FF. In addition, the LC also introduces power and delay overhead. In [25],

the power consumption due to an LC is approximated as 0.5 · (V²dd−High + V²dd−Low) · f · C · S(v), where f is the clock frequency, C is the load capacitance, and S(v) is the switching activity of

node v. Here, considering both dynamic and leakage power, the total power consumption of

LC is approximated as 0.5 · (Pdyn(Vdd−High) + Pdyn(Vdd−Low)) + Pleak(Vth−Low). Since both

LCFF and LC cause additional delay and power, voltage scaling has to be done carefully

to suppress the related delay/power overhead.
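The two LC power approximations above can be captured directly. The function names and units below are illustrative only; the formulas follow [25] and the total-power approximation used in this chapter.

```python
def lc_dynamic_power(vdd_high, vdd_low, f, c_load, s_v):
    """Dynamic LC power per the approximation of [25]:
    0.5 * (Vdd_high^2 + Vdd_low^2) * f * C * S(v)."""
    return 0.5 * (vdd_high ** 2 + vdd_low ** 2) * f * c_load * s_v

def lc_total_power(p_dyn_high, p_dyn_low, p_leak_low_vth):
    """Total (dynamic + leakage) LC power used in this chapter:
    0.5 * (Pdyn(Vdd_high) + Pdyn(Vdd_low)) + Pleak(Vth_low)."""
    return 0.5 * (p_dyn_high + p_dyn_low) + p_leak_low_vth
```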

The initial formulation is integer linear programming-based since the voltage assignment

variable for each node in the retimed graph takes one out of four possible states:

2The minimization of FF power consumption is addressed during the min-area retiming, so we do not consider this explicitly again during voltage scaling.

• state 1: high-Vdd plus low-Vth (maximum performance, maximum total power)

• state 2: low-Vdd plus low-Vth (medium performance, low dynamic power)

• state 3: high-Vdd plus high-Vth (medium performance, low leakage power)

• state 4: low-Vdd plus high-Vth (minimum performance, minimum total power).

In addition, the LC assignment variable for each edge either takes 0 (no LC) or 1 (with

LC).

The following variables are used in the ILP-based voltage scaling formulation:

• xv,k: voltage assignment variable for node v into state k (k = 1 corresponds to high-

Vdd+low-Vth, etc).

• m(e): level converter assignment on edge e, where m(e) = 1 means an LC is used on e

and m(e) = 0 means no LC on e; LCs are considered only on edges with w(e) = 0 (no FFs).

• zv,k: supply voltage level of v given that v is assigned to voltage state k.

• pv,k: total power consumption of v given that v is assigned to voltage state k.

• dv,k: delay of v given that v is assigned to voltage state k.

• s(v): arrival time of node v.

• plc, dlc: total power consumption and delay of a level converter.

• T : clock period constraint.

• D: difference between high Vdd and low Vdd.

The ILP-based dual supply/threshold voltage assignment for total power reduction un-

der timing constraint is formulated as follows:

Minimize Σv∈V Σk=1..4 (pv,k · xv,k) + Σe∈E (plc · m(e))    (58)

Subject to:

Σk=1..4 xv,k = 1, ∀v ∈ V    (59)

Timing constraints:

Σk=1..4 dv,k · xv,k + s(v) ≤ T, ∀v ∈ V    (60)

Σk=1..4 du,k · xu,k + dlc · m(e) + s(u) ≤ s(v), ∀eu,v ∈ E    (61)

s(v) ≥ 0, ∀v ∈ V    (62)

Level converter (LC) constraints:

Σi=1..4 zu,i · xu,i − Σj=1..4 zv,j · xv,j + D · m(e) ≥ 0, ∀e ∈ E    (63)

Integer constraints:

xv,k ∈ {0, 1}, ∀v ∈ V (64)

m(e) ∈ {0, 1}, ∀e ∈ E (65)

The objective of ILP is to minimize the total power consumption on all gates and level

converters used. Constraints (59) and (64) state that each gate can be assigned to only

one voltage state. Constraints (60) and (62) guarantee that the arrival time of each node

combined with its delay is always less than the target clock period. Constraint (61) states

that the arrival time of node v has to be greater than the summation of the arrival time of

node u, the delay of node u, and the delay of any level converter inserted on eu,v. This is how

the timing constraints are handled in the presence of level converters. Constraint (63) states that if a low-Vdd gate u drives a high-Vdd gate v, a level converter is inserted onto eu,v.
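As a sanity check, an integral solution can be verified against constraints (60)-(63) in a single topological pass. The sketch below uses hypothetical data structures and is not the thesis implementation.

```python
def check_voltage_scaling(topo, fanin, delay, z, m, d_lc, vdd_diff, T):
    """Verify an integral voltage scaling solution against the ILP
    constraints (60)-(63).

    topo: gates in topological order; fanin[v]: predecessors of v;
    delay[v]: gate delay under its assigned state; z[v]: supply level;
    m[(u, v)]: 1 iff an LC sits on edge (u, v); d_lc: LC delay;
    vdd_diff: D = Vdd_high - Vdd_low; T: clock period constraint.
    """
    s = {v: 0.0 for v in topo}            # arrival times; s(v) >= 0 is (62)
    for v in topo:
        for u in fanin[v]:                # propagate constraint (61)
            s[v] = max(s[v], s[u] + delay[u] + d_lc * m[(u, v)])
    for v in topo:                        # period constraint (60)
        if s[v] + delay[v] > T:
            return False
    for v in topo:                        # LC constraint (63): a low-Vdd
        for u in fanin[v]:                # driver of a high-Vdd gate needs an LC
            if z[u] - z[v] + vdd_diff * m[(u, v)] < 0:
                return False
    return True
```

With the chapter's 130nm values (Vdd 1.2V/0.6V, so D = 0.6), a low-Vdd gate driving a high-Vdd gate fails (63) unless m(e) = 1, and inserting the LC then adds d_lc to the arrival time.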

7.1.3 Linear Programming Relaxation

The related experiment shown in Section 7.2 indicates that the computational effort to solve

the ILP-based voltage scaling quickly becomes prohibitive as the size of the circuit increases.

In this section, a method to relax the ILP formulation into LP is proposed to overcome this limitation. The algorithm first solves the LP-relaxed version of the original ILP problem, which requires orders of magnitude less runtime. Next, the non-integral LP solution is converted into an integral ILP solution while satisfying the level conversion and clock period constraints. The objective of the LP remains the same: minimization of the total power

consumed by the gates and level converters.

Figure 54: An example showing that a simple rounding algorithm does not work with the LC constraint. (Figure: a gate u with dly(u) = 2.7 drives a Vdd-H gate v through an LC with dly(LC) = 1.3; dly(Vdd-H, Vth-H) = 1.2, pow(Vdd-H, Vth-H) = 0.6, dly(Vdd-L, Vth-L) = 2.6, pow(Vdd-L, Vth-L) = 0.3.)

One of the biggest challenges is the continuous

(LP) to integral (ILP) conversion of the voltage assignment (= xv,k) and level converter assignment (= m(e)) variables. This is because a simple rounding of the continuous values to the nearby integer values not only degrades the solution quality but may violate the clock period and level conversion constraint. Figure 54 shows that a simple rounding algorithm does not work with this LC constraint. Here, if low-Vdd and low-Vth are assigned to node u, then more power reduction can be gained. However, assigning low-Vdd and low-Vth to v requires a LC on edge e and hence the solution becomes infeasible.

The LP formulation uses the same objective and constraints as the original ILP formulation, i.e., we minimize Equation (58) under the constraints (59) to (63). Instead of (64)

and (65), however, the following non-integral constraints are used:

0 ≤ xv,k ≤ 1, ∀v ∈ V (66)

0 ≤ m(e) ≤ 1, ∀e ∈ E (67)

Figure 55 shows the LP-based voltage scaling algorithm. The goal is to solve the LP problem

and map the continuous xv,k and m(e) values to binary values, i.e., obtain the ILP solution,

while minimizing the total power under LC and timing constraints. The basic approach is

to first map m(e) values and use them to map xv,k values. This process is repeated with

new m(e) values until there is no gain. More specifically, a threshold value mth is used to

map m(e) into binary values (line 7-9). The LP problem is solved again based on these fixed

m(e) values and the solution is checked to see whether the timing constraints are met (line 10-11).

If so, a heuristic algorithm named voltage mapping, shown in Figure 56, is used to map the

LP-based Voltage Scaling Algorithm
input: retimed graph G(V, E)
output: dual Vdd/Vth scaling and LC insertion
// initial solution
1. set m(e) = 0 for all edges and solve LP;
2. voltage mapping;
3. best = compute total power;
// main loop
4. itr = 0;
5. while (gain ≥ gain limit and itr < max itr)
6.   compute new mth;
7.   for (each edge e ∈ E)
8.     if (m(e) > mth) then m(e) = 1;
9.     else m(e) = 0;
10.  solve LP;
11.  if (timing is met)
12.    voltage mapping;
13.    curr = compute total power;
14.    update best;
15.  itr++;

Figure 55: Linear programming relaxation algorithm to solve the ILP-based voltage scaling problem.

continuous xv,k values to binary and keep the solution (line 12-14). A gain-based gradient search is performed to obtain a new mth value (line 6). The whole process is repeated and the solution is checked whether the total power is further minimized under the new

LC assignment. This search continues until the gain is not significant or the number of iterations has exceeded a certain limit (line 5).

The baseline solution is obtained by setting m(e) = 0, solving LP, and performing the voltage mapping (line 1-3). Note that fixing m(e) = 0 for all edges means no LC is allowed to be inserted after the voltage scaling. In other words, the voltage scaling is severely restricted such that there should be no edge eu,v that connects a low-Vdd node u to a high-

Vdd node v unless w(e) > 0, i.e., a FF exists on e. Nonetheless, it is still possible to reduce the total power under this restriction, and the final result becomes the baseline solution. A gradient search is performed to obtain a new target threshold value mth, where the total power reduction during the last two iterations is used to compute the new target. Note that the power gain does not depend linearly on mth. It is possible to obtain more power

reduction with a higher and/or lower mth value. In case of a high mth value, the number of

LCs added is small, thereby reducing the power consumed by LCs. However, this limits the

voltage scaling opportunity. In case of a low mth value, however, the larger number of LCs

added increases the power consumed by LCs but allows more rigorous voltage scaling.
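The loop of Figure 55 can be sketched with its expensive steps injected as callbacks. All signatures below are assumptions of this example (the LP solve, voltage mapping, timing check, and threshold update are stand-ins, not the thesis implementation).

```python
def lp_voltage_scaling(edges, solve_lp, voltage_map, total_power,
                       timing_met, next_mth, gain_limit, max_itr):
    """Skeleton of the LP relaxation loop of Figure 55.

    Hypothetical callback interfaces:
      solve_lp(m)       -> (x, m_cont): continuous solution given fixed LCs
      voltage_map(x, m) -> integral voltage assignment
      total_power(sol)  -> float
      timing_met(x)     -> bool
      next_mth(history) -> new LC threshold from the recent power history
    """
    m = {e: 0 for e in edges}                     # baseline: no LCs (lines 1-3)
    x, m_cont = solve_lp(m)
    best = total_power(voltage_map(x, m))
    history, itr, gain = [best], 0, float("inf")
    while gain >= gain_limit and itr < max_itr:   # main loop (line 5)
        mth = next_mth(history)                   # gradient search (line 6)
        m = {e: (1 if m_cont[e] > mth else 0) for e in edges}  # lines 7-9
        x, m_cont = solve_lp(m)                   # line 10
        if timing_met(x):                         # lines 11-14
            curr = total_power(voltage_map(x, m))
            gain = best - curr
            best = min(best, curr)
            history.append(curr)
        itr += 1
    return best
```

With toy stub callbacks the loop terminates as soon as an iteration produces no gain above the limit, mirroring the exit condition on line 5.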

7.1.4 Voltage Mapping

The main objective of the voltage mapping stage is to map the continuous voltage assign-

ment variables xv,k resulting from the LP formulation to binary values. There exist two

major constraints during this mapping: LC (level converter) and timing constraints. Since

we have performed LC insertion before calling the voltage mapping step, the supply voltage

assignment has to honor the existing LCs, i.e., there should always be a low-Vdd to high-Vdd

transition on each edge e with an LC, as expressed in constraints (63) and (65). In addition,

the voltage mapping should be done in such a way that no node after the voltage mapping

should violate the clock period and arrival time constraints as expressed in constraints (60),

(61), and (62). Since the voltage mapping step picks only one of the four continuous assign-

ment variables (xv,1, xv,2, xv,3, xv,4) and makes it 1, while fixing the others to 0 for each node v, constraints (59) and (64) are also satisfied.

Figure 56 shows the voltage mapping algorithm. Since the goal is to reduce the total

power under LC and timing constraints, more low-Vdd and high-Vth nodes mean more power

reduction as long as the constraints are not violated. Note that a simple maximum function

may not guarantee the LC and timing constraints as proved in Theorem 1. For example,

if xv,1 = 0.2, xv,2 = 0.2, xv,3 = 0.4, and xv,4 = 0.2, then this “maximum” scheme assigns

high-Vdd plus high-Vth (k = 3) to v. In the algorithm, we visit each node in a topological

order so that the voltage mapping for all fan-in nodes is done when visiting a new node

(line 1). The primary inputs of circuits (PIs) are initialized to low-Vdd+high-Vth (line 2).

For each node in a topological order, dly(v) = Σk=1..4 xv,k · dv,k and vdd(v) = xv,1 + xv,3 are computed (line 5-6). dly(v) denotes the delay of node v based on the continuous voltage

assignment, and vdd(v) denotes the sum of high-Vdd related continuous variables.

Voltage Mapping Algorithm
input: LP-based voltage scaling with LC inserted
output: ILP-based voltage scaling with reduced LC set
1. T = topological ordering of gates;
2. assign low-Vdd+high-Vth to all PIs;
3. while (T is not empty)
4.   v = T.pop;
5.   dly(v) = Σk=1..4 xv,k · dv,k;
6.   vdd(v) = xv,1 + xv,3;
7.   v ← Vdd-L+Vth-H;
// Vdd mapping
8.   if (∃u ∈ FI(v) | m(eu,v) = 1)
9.     v ← Vdd-H;
10.  if (vdd(v) > 0)
11.    v ← Vdd-H;
// LC removal
12.  if (∃u ∈ FI(v) | (u = Vdd-H & m(eu,v) = 1) or (u = Vdd-L & m(eu,v) = 1 and v = Vdd-L))
13.    m(eu,v) ← 0;
// Vth mapping
14.  if (v = Vdd-H & dly(v) < delay(Vdd-H+Vth-H))
15.    v ← Vth-L;
16.  if (v = Vdd-L & dly(v) < delay(Vdd-L+Vth-H))
17.    v ← Vth-L;

Figure 56: Voltage mapping algorithm under LC and timing constraints. k = 1 and k = 3 denote the high-Vdd states in line 6.

The approach is to decide the best possible voltage mapping for the given node v based on the four possible scenarios shown in Figure 57. The algorithm starts with the minimum total power configuration for each node, i.e., low-Vdd+high-Vth (line 7). The algorithm then

decides whether we must raise the Vdd (line 8-11) or lower the Vth (line 14-17) based on the LC and timing constraints. During the Vdd mapping step, we first find for a given

node v whether there is any fan-in node u with low Vdd assigned and eu,v contains a LC.

If so, a high-Vdd has to be assigned to v to satisfy the LC constraint (line 8-9). Next, if

vdd(v) > 0, the previous linear programming partially assigned high-Vdd to v, and raising v to high-Vdd will never violate timing constraints (as proved in Theorem 1). At this point, it is important to note that some LCs become unnecessary during the primary input (PI) to primary output (PO) Vdd mapping process, such as cases 5, 7, and 8 in Figure 57.

Thus, these unnecessary LCs are removed if (i) a high-Vdd node drives a low- or high-Vdd node while using an LC (cases 5 and 7), or (ii) a low-Vdd node drives another low-Vdd node while using an LC (case 8). Since LC removal never increases the overall delay, the timing constraint is never violated. During the subsequent Vth mapping, the goal is to see if the initial high-Vth has to be adjusted as a result of timing constraints: if dly(v) lies in between the delay of a high-Vth gate and a low-Vth gate, a low-Vth assignment is guaranteed to satisfy the timing constraint at the expense of a slight power increase.

Figure 57: 8 possible supply voltage assignments for the dotted nodes. Valid: case 1 (high-Vdd drives high-Vdd), case 2 (low-Vdd drives high-Vdd through an LC), case 3 (high-Vdd drives low-Vdd), case 4 (low-Vdd drives low-Vdd). Invalid: cases 5-8, the same four driver/sink pairs with the LC usage flipped. The invalid solutions are either non-optimal (= an additional LC increases total power) or violate the LC constraint, and thus LP never generates them.
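The per-node decision of the voltage mapping algorithm (lines 7-17 of Figure 56) can be sketched as follows, under the simplifying assumption that all fan-in nodes are already mapped. Names and encodings ('H'/'L' states, dict inputs) are illustrative assumptions of this example.

```python
def map_node_voltage(dly_v, vdd_v, fanin_vdd, lc_on_fanin, delays):
    """Per-node decision of the voltage mapping algorithm (Figure 56).

    dly_v: continuous LP delay of v; vdd_v: x_{v,1} + x_{v,3};
    fanin_vdd[u]: 'H' or 'L' supply of the already-mapped fan-in u;
    lc_on_fanin[u]: True iff edge (u, v) carries an LC;
    delays[(vdd, vth)]: discrete gate delay per state.
    Returns (vdd, vth, lcs_to_remove).
    """
    vdd = "L"                                     # start at min power (line 7)
    if any(lc_on_fanin[u] and fanin_vdd[u] == "L" for u in fanin_vdd):
        vdd = "H"                                 # an LC forces high Vdd (lines 8-9)
    if vdd_v > 0:
        vdd = "H"                                 # safe per Theorem 1 (lines 10-11)
    # LC removal (lines 12-13): an LC is kept only when a low-Vdd fan-in
    # drives a high-Vdd v.
    remove = [u for u in fanin_vdd
              if lc_on_fanin[u] and not (fanin_vdd[u] == "L" and vdd == "H")]
    vth = "H"                                     # Vth mapping (lines 14-17)
    if dly_v < delays[(vdd, "H")]:
        vth = "L"
    return vdd, vth, remove
```

Using the chapter's four delay values, a node with partial high-Vdd assignment and a tight LP delay ends up high-Vdd/low-Vth, while a node fed by a high-Vdd driver through an LC keeps low Vdd and sheds the useless LC.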

Theorem 1 The voltage mapping algorithm shown in Figure 56 satisfies the level converter and timing constraints.

Proof: Given a node v when all supply voltage states of all fan-in nodes of v are known,

we will show that the assignment of v from the voltage mapping algorithm in Figure 56

satisfies the level converter and timing constraints. First, a simple case is shown when there

is only one fan-in edge eu,v connecting node u to node v, as shown in Figure 57. After

that, we will show the general case when there are many fan-in edges. First, we will prove

that cases 5-8 in Figure 57 are either invalid solutions or non-optimal:

• case 5: it can be a valid LP solution since the LP can assign vdd(u) < vdd(v),

vdd(u) > 0, and vdd(v) > 0. However, for the feasible ILP solution, this additional

LC can be removed without violating the level converter or timing constraints, as

shown in lines 12-13 in Figure 56. Note that by removing the LC, the delay of path (u, v) is

reduced. If there is an LC on edge e, then node v has to be set to high Vdd (case 2 from

Figure 57) and the LC has to be removed if vdd(u) is set to high (becoming case 1).

Note that in this case, the number of LCs will be equal to or less than that of the LP solution.

• case 6: this case is not a valid LP solution. It would mean that vdd(u) = 0,

vdd(u) < vdd(v), and m(e) = 0 in the LP solution, but this cannot happen since it

violates constraint (63). If vdd(u) > 0, vdd(v) > 0, and m(e) = 0 are assigned by the

LP solution, then only case 1 is a feasible solution, and it will not increase the number

of level converters since the supply voltages of both nodes are the same. If vdd(v) = 0 and

m(e) = 0 are assigned by the LP solution, then assigning a low supply voltage to

node v will not increase the number of level converters, because only a low supply voltage

node driving a high supply voltage node requires level converter insertion.

• cases 7-8: These cases are not optimal for the power reduction objective. If m(e) = 1

for edge e in the LP result, then these cannot be optimal LP solutions. For case

7, it means that vdd(u) > vdd(v) in the LP solution, and for case 8, it means that

vdd(u) = vdd(v) = 0. By removing the LC from the LP solution, the circuit will

consume less power than the original LP solution. Hence, neither can be an

optimal LP solution.

Cases 1-4 are feasible solutions since no constraint is violated. Given a node v with only one fan-in edge e(u, v), if the voltage state of u and the LC on e are known, then only states

1-4 are feasible. If the supply voltage of u is assigned low, then the power state of v depends on whether there is an LC on e. If the supply voltage of v is assigned high, then the power state of v depends on whether there is slack available on v.

In the case that there is more than one fan-in edge, it still holds for each edge that only cases 1-4 are valid solutions. From the algorithm, if there is a fan-in edge that falls into case 1 or 2, then a high supply voltage will be assigned to v. Note that it is not possible to have two or more fan-in edges where one of them falls in case 2 and another falls in case 4: this would imply that either case 2 becomes case 8 or case 4 becomes case 6, and neither is a valid LP solution. After the supply voltage state of v is assigned, we just need to check whether assigning a high threshold voltage to v increases the delay of node v beyond that of

the LP solution. If it does not, then we assign a high threshold voltage to v. By using this

conservative assignment (the new delay value will be equal to or less than the delay value from the

LP solution), we guarantee that constraints (60) and (61) always hold. So far, it has been shown that, given a node v, if all the voltage states of the fan-in nodes of v and the LC on each fan-in edge are known, then we can assign a state to v without additional level converter insertion on any fan-in edge of v. Then, by induction on the number of nodes in the circuit,

starting from the PIs and using the fact that for each voltage assignment of node v

no additional level converters are inserted and the timing constraints are satisfied, the algorithm satisfies the level converter and timing constraints for the entire circuit.

7.1.5 Post Refinement

The last step of the algorithm is the post refinement, where an additional voltage scaling is applied to the solution obtained from the previous steps, namely, retiming and ILP/LP voltage scaling. The primary concern during voltage mapping discussed in Section 7.1.4

is to satisfy the LC and timing constraints. Thus, the focus is to accept voltage mapping that never violates the timing constraint for each node, which results in a delay reduction for each node in most cases. This delay change of a node affects the delay of all of its downstream nodes in a directed graph and may allow additional power reduction among them. Thus, a positive timing slack, resulting from the conservative voltage mapping, needs

to be propagated downwards to correctly reflect the slack change globally. However, the

voltage mapping does not perform static timing analysis (= timing slack re-computation)

upon the voltage mapping of each node because of its prohibitive runtime, which may hide

some power reduction opportunity. Thus, the voltage mapping based on the initial timing

slack is a primary source of non-optimality. In addition, the LP formulation discussed in

Section 7.1.3 may assign m(e) values that are not close to 0 or 1 for potentially many edges.

Thus, relying on a single threshold value to decide whether an edge is given an LC or not is

another source of non-optimality.

Post Refinement
input: retimed and voltage scaled solution
output: refined voltage scaling solution
// clustering
1. perform static timing analysis;
2. mark all nodes with positive timing slack;
3. form clusters among marked nodes;
4. sort clusters based on size;
// main loop
5. for (each cluster C)
6.   while (there is power reduction)
7.     for (each node v ∈ C)
8.       power gain(v, slk(v), C);
9.     z = max power gain node;
10.    commit voltage change for z;
11.    update slack for downstream nodes of z;

Figure 58: The cluster-based post refinement algorithm that performs voltage scaling under LC and timing constraints. Each cluster contains a set of reachable nodes with positive timing slack.

Figure 58 shows the post refinement algorithm. The basic idea is to identify the nodes with positive timing slack and to reduce their total power consumption by additional voltage scaling under timing and LC constraints. This time, however, we examine the impact of the proposed voltage scaling of each node on all affected nodes. Clustering is performed based on timing slack, where each cluster contains a set of reachable nodes with positive slack (line 1-4). The largest cluster is visited first (line 4), since the exploration is limited to the nodes inside each cluster and thus the higher the number of

nodes examined to see the impact of voltage scaling the better. The algorithm visits each

cluster (line 5) and computes the total power gain for each node in the cluster (line 7-8).

During the power gain computation of each node v, the power reduction for v is computed,

as well as that of all of its predecessors inside the cluster, using the recursive algorithm power gain shown in Figure 59 (to be discussed later). The node that results in the maximum power reduction is selected and its voltage change is committed (line 9-10). Lastly, the timing

slack for all downstream nodes of the max-gain node is computed (line 11). The algorithm

continues to target the same cluster until there is no further power gain (line 6).

Figure 59 shows the recursive algorithm that computes the total power gain of a given

node and all its predecessors inside the given cluster. The voltage scaling and thus the

delay increase of a node v reduces the timing slack of many of its downstream nodes.

power gain(v, dly, C)
input: a node v ∈ C and timing slack dly
output: maximum power saving for v
1. mark v visited;
2. x = current voltage state of v;
// find best new voltage state
3. for (each voltage state i ≠ x)
4.   ∆p = power reduction from state x → i;
5.   ∆d = delay increase from state x → i;
6.   if (∆p > 0 and slk(v) > ∆d + dly)
7.     mark i feasible;
8. y = feasible voltage state with max ∆p;
9. dly = dly + ∆d based on x → y;
10. tot gain = ∆p based on x → y;
// recursive call
11. for (each non-visited fan-in u ∈ C)
12.   if (slk(u) > dly)
13.     gain = power gain(u, dly, C);
14.     tot gain = tot gain + gain;
15. return tot gain;

Figure 59: A recursive algorithm that computes the total power gain of a given node and all of its predecessors from voltage scaling refinement.

Thus, it is unlikely that there exists any power saving opportunity via voltage scaling among the

downstream nodes. The upstream nodes of v, however, are not affected by the delay change

of v. Thus, the exploration is limited to v and its predecessors to examine the impact of

voltage scaling v. In addition, the reason that we limit the search to the nodes inside the

given cluster is that it is not possible for the zero-slack nodes outside the cluster to

accommodate the delay increase without timing violation. For a given node v, the power

saving and delay increase for each candidate power state are computed (line 3-7). Among

the feasible power states, the maximum total power reduction node is picked (line 8-10).

The algorithm visits the predecessors and keeps track of the power gain from them (line

11-14). Finally, the total gain is returned (line 15) as the final output.

Note that the computation of ∆p (line 4) and ∆d (line 5) considers the impact of

additional LC insertion. For instance, if the current Vdd is set to high for a given node u and

one of its fanouts v is set to high-Vdd while m(eu,v) = 0, the ∆p for a high-to-low Vdd adjustment

for u should not only include the power saving from Vdd scaling but also the power increase

from the LC that has to be inserted. In addition, the ∆d should include the delay increase from high-to-low Vdd adjustment as well as the LC insertion. Moreover, this delay change from Vdd adjustment further affects the subsequent Vth scaling and its corresponding leakage power. Lastly, a similar argument applies when the impact of related LC removal from a

low-to-high Vdd adjustment is examined on dynamic/leakage/delay tradeoff.
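The recursion of Figure 59 can be sketched as follows. The per-state power/delay tables, slack values, and state encoding are hypothetical inputs of this example, and LC side-effects on ∆p/∆d (discussed above) are omitted for brevity.

```python
def power_gain(v, dly, cluster, state, visited, slk, fanin, power, delay):
    """Recursive total power gain of Figure 59 (simplified sketch).

    state[v]: current voltage state of v; power[v][i], delay[v][i]:
    per-state power and delay tables; slk[v]: timing slack of v;
    fanin[v]: predecessors of v; dly: accumulated delay increase so far.
    """
    visited.add(v)                                 # line 1
    x = state[v]                                   # line 2
    best_dp, best_dd = 0.0, 0.0
    for i in power[v]:                             # lines 3-8
        if i == x:
            continue
        dp = power[v][x] - power[v][i]             # power reduction
        dd = delay[v][i] - delay[v][x]             # delay increase
        if dp > best_dp and slk[v] > dd + dly:     # feasible and better
            best_dp, best_dd = dp, dd
    dly += best_dd                                 # line 9
    total = best_dp                                # line 10
    for u in fanin[v]:                             # lines 11-14
        if u in cluster and u not in visited and slk[u] > dly:
            total += power_gain(u, dly, cluster, state, visited,
                                slk, fanin, power, delay)
    return total                                   # line 15
```

On a two-node chain u -> v where both nodes can move from the maximum-power state to the minimum-power state, the slack of the upstream node must absorb the accumulated delay increase of both moves before its gain is counted.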

7.2 Experimental Results

The algorithm, named Retiming-based Voltage Scaling (RVS), is implemented in C++/STL, compiled with gcc v3.2.2, and run on a Pentium IV 2.4 GHz processor. The solutions to

the ILPs/LPs were found using the GNU Linear Programming Kit [51], version 4.5. The

benchmark set consists of nine sequential circuits from the ISCAS89 benchmark suite. The following

technology parameters are used for a 130nm process: Vdd high/low is set to 1.2V/0.6V. Vth

high/low is set to 0.23V/0.12V. The following delay values are used for the four voltage

configurations: high-Vdd/low-Vth = 1.0ps, high-Vdd/high-Vth = 1.24ps, low-Vdd/low-Vth =

2.53ps, and low-Vdd/high-Vth = 4.26ps. The delay, dynamic power, and leakage power of

each gate are computed according to [119, 19].

In Table 13, the impact of retiming objectives on the total power consumption is shown.

Two objectives are compared: minimum-FF vs. maximum-FF [19]. For each objective, we

report (i) total edge weight after retiming, (ii) actual number of FFs after retiming, (iii)

total power consumed by the gates/LC only, and (iv) total power consumed by the gates/LC

as well as FFs. Note that the total weight does not equal the total number of FFs, since a

single FF with multiple fanouts can replace several fanout edges with positive weights. The

ILP-based retiming targets the edge weights, not the actual FF count. It is observed

that the FF count indeed is lower under the min-FF objective. From the comparison of

gate-power, it is observed that the max-FF objective indeed produces results comparable

to those of the min-FF objective. However, the total power including FFs is much higher

for the max-FF objective. This proves that the power consumed by FFs can no longer

be ignored. Thus, we conclude that min-FF retiming is a better choice for total power

reduction if combined with the subsequent voltage scaling.

Table 13: Impact of retiming objective. The total power includes the power consumed by FFs.

                min-FF retiming                   max-FF retiming
ckt     FF wgt  FF cnt  gate pwr   total   FF wgt  FF cnt  gate pwr   total
s641       19      19    187.47   232.69      80      32    170.03   246.19
s713       19      19    199.46   244.68      94      37    181.92   269.98
s820        6       6    296.05   310.33     333      10    299.06   322.86
s832        6       6    299.99   314.27     345      10    307.47   331.27
s838       68      63    283.61   433.55     213      67    247.60   407.06
s1196      25      22    381.81   434.17     114      80    389.36   579.76
s1238      26      22    395.58   447.94     118      83    404.85   602.39
s1488      45       7    535.79   552.45     516      25    568.16   627.66
s1494      45       7    536.21   552.87     550      31    569.81   643.59

Table 14 shows the voltage scaling results of the RVS algorithm. The total number of nodes under each Vdd/Vth configuration for each circuit is reported. The number of LCs used is also reported. It is observed that the majority of nodes are assigned high-Vdd/high-

for timing critical nodes as a result of the timing constraints. However, the voltage scaling

was successful in keeping this portion low. The minimum power configuration (= low-

Vdd/high-Vth) is used heavily for almost all circuits to reduce the total power consumption.

It is interesting to note that the circuits with heavy low-Vdd/high-Vth usage tend to utilize

more LCs. The low-Vdd/low-Vth nodes also provide the same kind of effect as high-Vdd/high-

Vth nodes. However, the Vdd scaling has more impact on the delay increase than Vth scaling,

which is why low-Vdd/low-Vth nodes are not used as often as high-Vdd/high-Vth nodes, due to

their larger delay (2.53ps vs 1.24ps). However, a straightforward extension of the formulation

can boost the usage of low-Vdd/low-Vth in case that dynamic power reduction is preferred

over leakage power.

Table 15 shows the breakdown of total power into leakage (for all gates), dynamic (for

all gates), LC power (dynamic+leakage) and FF power (dynamic+leakage). Note that

the power consumed by FFs is a dominant factor for several circuits (s838 and s1238, for

example). In addition, the dynamic power is still higher than leakage for 130nm technology.

Table 14: Voltage scaling result for the RVS algorithm.

ckt     Vdd-L/Vth-H  Vdd-L/Vth-L  Vdd-H/Vth-H  Vdd-H/Vth-L   LC
s641        244           14           93           86        16
s713        257           12           92           90        17
s820         15            0          278           33         0
s832         15            0          276           33         0
s838        113            6          308           54         4
s1196       140            5          338           74        17
s1238       127            0          336           73        19
s1488        18            1          599           62        10
s1494        18            2          599           55        12

Table 15: Final power distribution.

  ckt    leakage  dynamic     LC      FF   total
  s641     61.69   114.21  11.57   45.22  232.69
  s713     64.40   122.21  12.85   45.22  244.68
  s820     31.33   264.72   0.00   14.28  310.33
  s832     31.21   268.78   0.00   14.28  314.27
  s838     50.53   230.51   2.57  149.94  433.55
  s1196    62.13   308.74  10.94   52.36  434.17
  s1238    58.64   324.08  12.86   52.36  447.94
  s1488    63.13   466.22   6.44   16.66  552.45
  s1494    60.51   467.98   7.72   16.66  552.87

The power consumed by the LCs is relatively small.

Table 16 shows a comparison among five voltage scaling methods in terms of total (dynamic + leakage) power consumption. The first, named INIT, assigns all nodes the high supply and low threshold voltage; it serves as an upper bound on total power consumption. The CVS column reports the Clustered Voltage Scaling results proposed in [119]. The VVS algorithm proposed in [108] performs gate sizing as well as Vdd/Vth scaling; the gate sizing part of the VVS algorithm is removed, and the results are reported in the MVVS (modified VVS) column.3 The Retiming-based Voltage Scaling (RVS) results are reported in the LP column. In addition, the ILP problem discussed in Section 7.1.2

3 After the backward pass of the VVS algorithm, we try to assign the high threshold voltage to as many nodes as possible. This is a reasonable modification since there exists significant room for power reduction when gate sizing is not used.

Table 16: Total power comparison among several voltage scaling algorithms. Retiming and post refinement are performed for all five algorithms.

  ckt      INIT     CVS    MVVS      LP     ILP
  s641   434.36  374.51  253.92  232.69  230.68
  s713   458.06  392.36  268.73  244.68  243.39
  s820   425.78  412.48  327.65  310.33  309.78
  s832   428.96  415.66  331.59  314.27  312.58
  s838   627.30  579.69  440.08  433.55  428.67
  s1196  646.83  616.24  438.98  434.17  434.13
  s1238  648.45  619.19  454.10  447.94  446.66
  s1488  796.15  773.81  577.14  552.45  551.44
  s1494  795.55  773.21  575.02  552.87  550.55
  RATIO    1.00    0.94    0.69    0.66    0.66
  TIME   28 sec  29 sec  35 sec  44 sec  1 day

without the LP relaxation is solved, and the results are reported in the ILP column. Due to its prohibitive runtime, we allow one day per circuit and report the best solution discovered within that limit. Finally, the min-power retiming discussed in Section 7.1.1 and the post refinement discussed in Section 7.1.5 are performed for all five algorithms in the comparison. The results show that RVS outperforms all the other algorithms. In particular, the improvement of RVS over MVVS is consistent while the runtime overhead is minimal. Compared with ILP, RVS obtains results that are very close to those of the ILP at a fraction of the runtime. Thus, the LP relaxation and voltage mapping algorithms are effective in generating high-quality results within a short runtime.

7.3 Conclusions

This chapter presented the first work that combines retiming with dual supply/threshold voltage scaling for total power reduction. The solution consists of three steps: low-power retiming, ILP-based voltage scaling, and post refinement. The ILP formulation is relaxed into an LP, the LP is solved in an iterative manner, and several heuristics convert the LP solution back to an ILP solution. The experiments show that the solutions obtained are very close to those of the pure ILP approach within a fraction of the runtime, while outperforming several well-known methods.

CHAPTER VIII

MICROARCHITECTURE-AWARE CIRCUIT PLACEMENT WITH STATISTICAL RETIMING

Process variations in digital circuits make circuit timing validation an extremely challenging task. Variations in several high-impact intra-die process parameters such as effective gate length, thin-oxide thickness, and wire width/height can easily invalidate the timing predictions made before fabrication [12, 80, 84, 77, 2]. Therefore, statistical timing analysis tools that model the gate and wire delays as probability distribution functions have become increasingly popular for tackling timing validation under these process variations [62, 37, 10, 20, 4, 90, 73, 125, 91, 71, 3, 23, 65]. However, most of the existing studies focus on combinational circuits or subcircuits (after flip-flop removal) and fail to address sequential circuit timing validation directly. Recent work on statistical timing analysis for sequential circuits [36] allows the user to model flip-flops (FFs) and use them to predict the timing information after retiming. That work achieves a significant performance improvement by exploiting retiming-aware timing slack.

As pointed out in Chapter 6, retiming is an important technique that allows circuits to meet the timing target. This is even more important for microarchitecture-aware physical planning, where the system assumes that the target clock period can be met. Under process variations, the deterministic retiming technique discussed in the last two chapters must be modified. In this chapter, Statistical Retiming-based Timing Analysis (SRTA) is developed for the timing verification and optimization of sequential circuits under process variations. First, a Statistical Bellman-Ford (SBF) algorithm is proposed for computing the longest path length distribution for directed graphs with negative cycles. It is proved in Section 8.1 that an outcome-by-outcome statistical extension of the original Bellman-Ford algorithm correctly computes the true longest path length distribution, but at a very slow rate. Next, it is shown that two straightforward extensions of the Bellman-Ford algorithm for statistical analysis cannot guarantee the correctness of the results. Then, an SBF algorithm is proposed that closely approximates and efficiently computes the statistical longest path length distribution if there exists no positive cycle, or detects one if the circuit is likely to have a positive cycle. The SBF algorithm is integrated into SRTA, where it checks the feasibility of the target clock period distribution for retiming analysis. Finally, it is shown that the final critical path delay distribution after retiming is the statistical maximum among all primary outputs and all feedback vertices. Monte Carlo simulation is used to validate the accuracy of the SRTA algorithm.

The SRTA algorithm is integrated into a mincut-based global placer to optimize statistical longest paths in sequential circuits. The placer performs multi-level bipartitioning recursively until the desired number of partitions is obtained. Then, the SRTA algorithm computes the statistical critical paths while considering retiming, and the placer tries to place these paths into a single partition.

The remainder of this chapter is organized as follows. Section 8.1 describes our statistical

Bellman-Ford algorithm. Section 8.2 presents the statistical retiming-based algorithm and

its application in mincut-based global placement. The experimental results are presented

in Section 8.3. Section 8.4 concludes this chapter.

8.1 Statistical Longest Path Analysis

8.1.1 Statistical Bellman-Ford Algorithm

First, some quantities from probability theory that are required to develop the algorithms are defined; for more precise definitions of these quantities, see [40, 11]. Then, a statistical version of the Bellman-Ford algorithm that correctly solves the statistical longest path problem defined below is introduced.

Let Ω be the set of outcomes of a fabrication process. A subset of Ω is called an event. Let P be a function that assigns a probability to each event. A random variable X : Ω → R* maps each outcome to a number in the extended real line R* := [−∞, ∞]. The probability that a random variable X takes a value in a subset M of R* is P[ω | X(ω) ∈ M]. Assume that the probability P determines the joint (and hence the marginal and conditional) distribution of all random variables of interest.

Let G = (V, E) be a directed graph with a source node s and a sink node t, and let w : E × Ω → R be an associated edge-length function, which is a random variable for each edge (u, v) ∈ E. Without loss of generality, it is assumed that there are no weights on the nodes (node weights can be pushed to their fan-in edges). Let K denote the number of directed simple (i.e., cycle-free) s−t paths in G, and let li : Ω → R denote the length of the ith path, i = 1, . . . , K. Also, let G(ω) be the graph G with length w(u, v)(ω) on edge (u, v) ∈ E. The longest path X of G is defined as follows: for each ω ∈ Ω, X(ω) = max{l1(ω), . . . , lK(ω)} if there is no positive cycle in G(ω), and X(ω) = ∞ otherwise. The distribution of X is determined by the probability P as mentioned above. The Statistical Longest Path Problem is then defined as finding the distribution of the longest s−t path in G = (V, E) with edge-length function w : E × Ω → R.

First, the Bellman-Ford (BF) algorithm is extended to obtain the outcome-by-outcome Statistical Bellman-Ford algorithm (oSBF), shown in Figure 60. As opposed to updating a certain set of constants (such as arrival times a[i]) in the BF, each step of the oSBF updates certain random variables that are functions on Ω by updating their values for each outcome ω ∈ Ω. As a result, the complexity of the oSBF is high when the number of possible outcomes is large. In fact, when the random variables are continuous, such as uniform random variables, Ω has uncountably many elements and the oSBF cannot be carried out in practice. However, its properties, which are proved next, are useful for the approximation algorithm presented in the following section. Note that Monte Carlo simulation can be considered an approximate version of the oSBF in which certain elements of Ω are sampled according to the probability P. The following theorem proves the correctness of the oSBF algorithm.

Theorem 2 If P[ω | G(ω) has a positive cycle] = 0, then a[i] from the oSBF has the same distribution as that of the random variable representing the longest path from s to i, i ∈ V. Otherwise, the oSBF returns FALSE.

Proof: For each outcome ω ∈ Ω, the oSBF is equivalent to the BF, which correctly

oStatistical Bellman-Ford(G, w, s, t)
  Initialization Step
    for (each v ∈ V) a[v](ω) ← −∞, ∀ω ∈ Ω;
    a[s](ω) ← 0, ∀ω ∈ Ω;
    Stop(ω) ← NO, ∀ω ∈ Ω;
    g(ω) ← −1, ∀ω ∈ Ω;
    iter ← 1;
  Relaxation Step
    while (Stop(ω) = NO for some ω ∈ Ω and iter < |V|)
      iter ← iter + 1;
      Ω′ ← {ω ∈ Ω | Stop(ω) = NO};
      Stop(ω) ← YES, ∀ω ∈ Ω′;
      for (each v ∈ V)
        for (each ω ∈ Ω′)
          for (each edge (u, v) ∈ E)
            if (a[v](ω) < a[u](ω) + w(u, v)(ω))
              then a[v](ω) ← a[u](ω) + w(u, v)(ω); Stop(ω) ← NO;
  Positive Cycle Detection Step
    for (each (u, v) ∈ E)
      for (each ω ∈ Ω)
        if (a[v](ω) < a[u](ω) + w(u, v)(ω)) g(ω) ← +1;
  Output Step
    if (P[ω | g(ω) = +1] > 0) then return FALSE;
    else return TRUE and a[t];

Figure 60: A description of the outcome-by-outcome Statistical Bellman-Ford (oSBF) algorithm

calculates the lengths of the longest s−i paths a[i](ω), i ∈ V, or correctly identifies a positive cycle in G(ω). Hence, when the oSBF terminates, the function (random variable) a[i] and the longest s−i path are equal on Ω1 := {ω ∈ Ω | G(ω) has no positive cycle} and differ on Ω2 := {ω ∈ Ω | G(ω) has a positive cycle}. If P[Ω2] = 0, then they are equal with probability one and hence have the same distribution. Otherwise, there is a nonzero probability that a positive cycle exists, and therefore the oSBF returns FALSE.
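As noted above, Monte Carlo simulation can be viewed as a sampled version of the oSBF. The sketch below (Python; the function name, graph, and Gaussian edge lengths are illustrative assumptions, not part of the thesis) runs the Bellman-Ford relaxation and positive-cycle detection of Figure 60 outcome by outcome on sampled edge lengths:

```python
import random

def osbf_sampled(nodes, edges, src, samples=1000, seed=0):
    """Outcome-by-outcome Bellman-Ford on sampled edge lengths.

    nodes: list of node names; edges: dict (u, v) -> callable drawing one
    edge-length sample.  Returns (positive-cycle probability estimate,
    {node: list of longest-path-length samples from src}).
    """
    rng = random.Random(seed)
    arrivals = {v: [] for v in nodes}
    pos_cycles = 0
    for _ in range(samples):
        w = {e: draw(rng) for e, draw in edges.items()}   # one outcome
        a = {v: float("-inf") for v in nodes}
        a[src] = 0.0
        for _ in range(len(nodes) - 1):                   # relaxation step
            for (u, v), length in w.items():
                if a[u] + length > a[v]:
                    a[v] = a[u] + length
        cyc = any(a[u] + length > a[v]                    # detection step
                  for (u, v), length in w.items())
        pos_cycles += cyc
        if not cyc:
            for v in nodes:
                arrivals[v].append(a[v])
    return pos_cycles / samples, arrivals

# Tiny example: s -> a -> t plus a (safely) negative self-loop on a.
edges = {
    ("s", "a"): lambda r: r.gauss(2.0, 0.1),
    ("a", "a"): lambda r: r.gauss(-5.0, 0.1),  # negative cycle
    ("a", "t"): lambda r: r.gauss(1.0, 0.1),
}
p_cyc, arr = osbf_sampled(["s", "a", "t"], edges, "s")
```

Here the sink arrival samples `arr["t"]` approximate the longest s−t path length distribution (mean near 3.0 for this example), and `p_cyc` estimates P[G(ω) has a positive cycle].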

Backward edges and backward nodes are defined as follows:

Definition 1 For a given order of nodes P , (u, v) ∈ E is a forward edge if u precedes v in

P , and a backward edge, otherwise. In the latter case, node u is called a backward node.

Lemma 3 If a node order L = {s, v1, . . . , vn, t} is used in the relaxation step of the oSBF, then after j relaxation iterations, a[i] from the oSBF is the random variable representing the longest (possibly non-simple) s−i path in G that contains j − 1 or fewer (possibly repeated) backward edges.

Proof: Let B denote the set of all backward edges associated with the order L. Then the subgraph GB := (V, E \ B) is a directed acyclic graph (DAG). The update step for node i in the relaxation step is

    a[i] := max_{u ∈ FI(i)} [ a[u] + w(u, i) ],    (68)

where FI(i) := {u ∈ V | (u, i) ∈ E} is the set of fan-ins of node i. At the first iteration of the relaxation step, when a[i] is updated, a[u] = −∞ for all backward fan-ins u ∈ FIb(i) := {u ∈ FI(i) | (u, i) ∈ B}, because by Definition 1 the nodes u ∈ FIb(i) come after node i in the order and hence have not yet been updated. Thus, at the first iteration of the relaxation step, it is sufficient to perform relaxation on GB. Since GB is a DAG and L is a topological order of GB, a[i] represents the longest s−i path that contains zero backward edges after the first relaxation iteration.

Now suppose that the result of Lemma 3 holds up to some j ≥ 1 (Induction Hypothesis 1: IH 1). At iteration j + 1 of the relaxation step, it can be shown by induction that after a[i] is updated using (68), it represents the longest s−i path containing at most j backward edges. Consider the update for node v1. Since s is the only node that precedes v1 in L, all other nodes u ∈ FI(v1) are updated after v1. By IH 1, a[u] represents the longest s−u path with at most j − 1 backward edges for all u ∈ FI(v1). From (68), the updated a[v1] is the longest s−v1 path with at most j backward edges, because any s−v1 path with c > 0 backward edges has a backward edge as its last edge, and removing that edge results in a path with c − 1 backward edges.

Suppose that a[vi], i = 1, . . . , r, are now the longest s−vi paths with at most j backward edges for some r ≥ 1 (Induction Hypothesis 2: IH 2). Similarly, from (68), the updated a[vr+1] is the longest s−vr+1 path with at most j backward edges, because any s−vr+1 path with c > 0 backward edges has either a backward edge or a forward edge as its last edge. If the last edge is a backward edge, removing it results in a path with c − 1 backward edges (this case corresponds to u ∈ FIb(vr+1), whose a[u] are, from IH 1, the longest s−u paths with at most j − 1 backward edges). If the last edge is a forward edge, removing it results in a path with c backward edges (this case corresponds to u ∈ FIf(vr+1) := {u ∈ FI(vr+1) | (u, vr+1) ∉ B}, whose a[u] are, from IH 2, the longest s−u paths with at most j backward edges). This completes the proof.

The following theorem improves the bound on the number of iterations of the relaxation step when there is no positive cycle. The oSBF automatically terminates once the number of relaxation iterations reaches this bound (via the condition Stop(ω) = YES for all ω ∈ Ω). However, this is not true when the distribution of each a[i] is updated approximately. Hence, the bound from this theorem will be useful for our approximation algorithm.

Theorem 4 If P[ω | G(ω) has a positive cycle] = 0, and a node order L = {s, v1, . . . , vn, t} is used in the relaxation step of the oSBF, then after k + 1 iterations, a[i] from the oSBF has the same distribution as that of the random variable representing the longest path from s to i, i ∈ V, where k is the maximum number of connected backward nodes that can appear on a simple s−t path in G.

Proof: Since G has no positive cycles with probability 1, the longest s − t path is a

simple path with probability 1. According to the definition of k, the longest path has, at

most, k backward nodes and hence, at most, k backward edges. The theorem follows from

Lemma 3.

The result from Theorem 4 can be applied to approximation methods that are similar to the oSBF, except that at each iteration, a[i], i ∈ V, is updated approximately. More specifically, if the approximate a[i] is close to the actual a[i] obtained from the oSBF, then after k + 1 iterations, the approximate a[i] is also a good approximation of the actual longest s−i path. Hence, when there is no positive cycle with probability 1, the approximation algorithm can be stopped after k + 1 iterations.

8.1.2 Limitation of the Bellman-Ford Extensions

To the best of our knowledge, all proposed analytical models for statistical timing analysis suffer from the error introduced by the maximum function, because the output of the maximum function follows a new form of distribution. When operating on approximated distributions (such as a normal approximation of the delay distribution) rather than the true distribution, the Bellman-Ford algorithm can return incorrect results, because the original Bellman-Ford algorithm is not designed to tolerate errors resulting from statistical computation.

More precisely, it is typically assumed that the joint distribution of arrival times is fully characterized by vectors of parameters θi, i ∈ V, which belong to a certain set Θ that is closed under addition.1 For example, when each node is assumed to be independently normally distributed, a two-dimensional vector [µi, σi²] ∈ R × [0, ∞) can be used to describe the mean and variance of the arrival time of node i. Let fmax : Θ × Θ → Θ denote the maximum function that approximates the distribution of the maximum by a distribution characterized by a vector in Θ. In this subsection, two examples are used to demonstrate the drawbacks of simple extensions of the Bellman-Ford algorithm.

The first extension is to use the Bellman-Ford algorithm to report positive cycles, as in statistical timing analysis; an illustrating example is shown in Figure 61(a). In this case, a positive cycle is reported even though the original graph has no positive cycle. At the first iteration, an input arrives at node A with value a. Flip-flop B, buffer C, and gate D have delays b, c, and d, respectively. The output value of node D is D(1) = fmax(a, C(0)) + d = a + d, where C(0) denotes the vector describing the distribution of the arrival time of gate C at iteration 0. After the signal propagates through flip-flop B and gate C, the new value of node D becomes D(2) = fmax(a, a + d + b + c) + d. Let Δ(2) = D(2) − D(1) denote the change in the value of node D. Now if d + b + c is originally negative with probability 1, but its distribution is approximated by that of a negative-mean random variable with a small probability of being positive (for example, a normal random variable with mean equal to

1A set A is closed under addition if for any a, b ∈ A, we have a + b ∈ A. This assumption leads to an efficient propagation procedure which, however, can be relaxed.


Figure 61: Illustration of Bellman-Ford update

−10 and variance equal to 9), then it cannot be concluded that Δ(2) will be a zero vector. Alternatively, even if the true distribution is used initially, the error from the maximum function could lead to the same situation. After one more iteration of the Bellman-Ford, D(3) = fmax(a, D(2) + b + c) + d = D(2) + Δ(3). A similar argument shows that Δ(3) may not be a zero vector either. It can be concluded from related experiments that the Δ(i), i = 1, 2, . . ., are small but might not become zero vectors after the |V| iterations; consequently, the algorithm reports a positive cycle.

The second extension of the Bellman-Ford algorithm introduces an error bound on the longest path length distribution updates; it is referred to as the error-bounded Statistical Bellman-Ford (eSBF) algorithm.2 Specifically, when the change in the distribution (for example, the norm of Δ(i)) is less than the bound, it is treated as no update.

Although imposing a positive error bound δ can help the Bellman-Ford algorithm terminate when the graph has no positive cycle, it can also cause the algorithm to stop too early when δ is too large. Consequently, from Lemma 3, some paths are not considered if the algorithm stops before k + 1 iterations. Figure 61(b) shows an example in which the delay from path A is ignored: if the change of the arrival time at D resulting from the delay propagated through A, which is Δ(i+1) = fmax(fmax(D(i) + c, b), a) + d − D(i), is considered small with respect to the error bound δ, and the arrival time of node A has no further update, the algorithm could terminate without updating D. Hence, the

2We use this algorithm in comparison with other Bellman-Ford extensions.

information from path A is not propagated to the calculation of some other arrival times. As a consequence, the distribution of some s−t paths that contain path A is not accounted for. Depending on the circuit structure, the total error resulting from this early termination could lead to a large error in the arrival times.

8.1.3 The k-Statistical Bellman-Ford Algorithm

The last extension of the Bellman-Ford algorithm, referred to as the k-Statistical Bellman-

Ford (kSBF) algorithm, is shown in Figure 62. This is an algorithm that closely approxi- mates and efficiently computes the longest path length distribution of directed graphs with negative cycles. kSBF is used in the statistical retiming-based timing analysis (SRTA) in- troduced in the next section. First, a depth-first search algorithm is called to identify all backward edges and sort all nodes in a topological order. For each backward node, the depth-first search DF S0 is called by setting this backward node as a source node. DF S0 returns the maximum number of connected backward nodes reachable by a simple path from the given source. The maximum number of connected backward nodes of the graph

(=k) is the largest number obtained by the DF S0 algorithm. Note that this reachability

algorithm needs to be performed only once. If all backward nodes are likely to be connected,

the reachability step is not required, and instead, the total number of backward nodes can

be used.

After the maximum number of connected backward nodes of the graph is found, the arrival times of all nodes are initialized. Next, the relaxation step is performed; for the statistical longest path computation, k + 1 iterations are required. After the relaxation is done, all simple paths from source to sink have been considered, according to Theorem 4. Then, the statistical positive cycle detection algorithm proposed in [27] is used. If the probability of having no positive cycle is less than an acceptable probability Pa, the algorithm returns FALSE; otherwise it returns TRUE.

Statistical Bellman-Ford(G, w, s, t, Pa)
  Reachability Check Step
    DFS(s); // find all backward edges
    backward_node ← 1;
    for (each v ∈ V)
      if (a backward edge is connected to v)
        backward_node ← backward_node + 1;
        list_backward_node ← v;
    max_k ← 0;
    for (each v ∈ list_backward_node)
      k ← DFS′(v); // backward nodes connected with v
      if (max_k < k) max_k ← k;
  Initialization Step
    for (each v ∈ V) a[v] ← −∞;
    a[s] ← 0;
  Relaxation Step
    for (iter ← 1 to max_k + 1)
      for (each v ∈ V)
        a[v] ← max_{u∈FI(v)} [ a[u] + w(u, v) ];
  Positive Cycle Check Step
    P(cycle) ← check_pos_cycle();
  Output Step
    if (P(cycle) ≤ Pa) then return FALSE;
    else return TRUE and a[t];

Figure 62: The k-Statistical Bellman-Ford algorithm (kSBF) used in our statistical retiming-based timing analysis (SRTA)
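A deterministic scalar analogue of Figure 62 illustrates the k + 1 relaxation passes (the distribution propagation and the statistical cycle check of [27] are replaced here by plain numbers and an exact comparison; names and the example graph are illustrative):

```python
def ksbf(order, edges, src, k):
    """Run k+1 Bellman-Ford relaxation passes in the given node order.

    By Theorem 4, k+1 passes suffice when k is the maximum number of
    connected backward nodes on any simple src-sink path.
    Returns (has_positive_cycle, arrival_times).
    """
    a = {v: float("-inf") for v in order}
    a[src] = 0.0
    for _ in range(k + 1):                         # relaxation step
        for v in order:
            for (u, vv), w in edges.items():
                if vv == v and a[u] + w > a[v]:
                    a[v] = a[u] + w
    # A further possible relaxation implies a positive cycle.
    cyc = any(a[u] + w > a[v] for (u, v), w in edges.items())
    return cyc, a

# Retiming-style graph with one backward (feedback) edge, so k = 1.
edges = {("s", "v1"): 1.0, ("v1", "v2"): 1.0,
         ("v2", "v1"): -3.0, ("v2", "t"): 1.0}
cyc, a = ksbf(["s", "v1", "v2", "t"], edges, "s", k=1)
```

Here the feedback cycle v1→v2→v1 has total weight −2 (negative, so feasible), and the sink arrival a["t"] settles at 3.0 within the k + 1 = 2 passes.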

8.2 Global Placement with Statistical Retiming

8.2.1 Modeling Delay Distribution

In this chapter, the delay distribution model is based on [125]. It is assumed that the delay of each gate and wire has a Gaussian distribution. The Elmore delay model is used for the wire delay computation, based on the following equation (similar to [20]):

    d_int = d_int^0 + Σ_{i∈Γ_g} [∂d/∂L_g^i] ΔL_g^i + Σ_{i∈Γ_g} [∂d/∂W_g^i] ΔW_g^i + Σ_{i∈Γ_int} [∂d/∂T_int^i] ΔT_int^i

where d_int^0 is the expected value of the wire delay. Γ_g and Γ_int are the sets of grids where the receivers reside and where the interconnect tree traverses, respectively. ΔL_g^i, ΔW_g^i, and ΔT_int^i are random variables representing the variation around the expected values of transistor length, transistor width, and metal thickness, respectively (similar to [20]). The partial derivatives are derived from the transistor and wire delay models of [95, 122, 103]. A 10% variation in each process parameter term is assumed in this study.
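The first-order delay model above can be sampled directly. A minimal sketch, assuming independent Gaussian parameter variations; the nominal delay, sensitivities, and spreads below are made-up numbers for illustration, not values from this study:

```python
import random

def sample_delay(d_nom, sens, spreads, rng):
    """First-order delay sample: nominal plus sensitivity-weighted
    parameter variations, with each Delta_p ~ N(0, spread^2) drawn
    independently (correlation handling via PCA is omitted here)."""
    return d_nom + sum(s * rng.gauss(0.0, sp)
                       for s, sp in zip(sens, spreads))

rng = random.Random(7)
# Hypothetical sensitivities to [Lg, Wg, Tint] and their spreads.
samples = [sample_delay(100.0, [0.8, -0.3, 0.5], [3.0, 3.0, 2.0], rng)
           for _ in range(20000)]
mean = sum(samples) / len(samples)
```

Because the model is linear in the Gaussian variations, the sampled delay is itself Gaussian with mean d_nom, matching the Gaussian-delay assumption stated above.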

The principal component analysis (PCA) technique, similar to [20], is used to derive the first-order form of the arrival and required time delay distributions. The basic idea of PCA is to transform the input coefficients into orthogonal terms so that the coefficient terms are uncorrelated. Reconvergent correlation can be handled efficiently by PCA. A grid-based hierarchical model similar to [20] is used for spatial correlation: if two gates are located near each other, they are more correlated than when they are far apart.

Four operations are involved in the computation of the statistical sequential arrival time and the statistical required time: maximum, minimum, addition, and subtraction. The addition or subtraction of two Gaussian distributions results in another Gaussian distribution, so the coefficients of each term in the first-order model can be added and subtracted directly. The maximum and minimum functions require the tightness probability calculation from [125], which is derived from [29, 16]. Based on this model and the assumption that the maximum and/or minimum of two Gaussian distributions results in a new Gaussian distribution, the resulting coefficients can be expressed as the sum of the products of the input distributions and the tightness probabilities.
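The Gaussian maximum with its tightness probability can be sketched with Clark's classic moment-matching formulas, from which the tightness probability used in [125] derives. This standalone Python sketch matches the first two moments and, as the text assumes, treats the result as Gaussian:

```python
import math

def phi(x):   # standard normal pdf
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def Phi(x):   # standard normal cdf
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def gauss_max(m1, s1, m2, s2, rho=0.0):
    """Moment-matched Gaussian approximation of max(X1, X2) for
    X1 ~ N(m1, s1^2), X2 ~ N(m2, s2^2) with correlation rho.
    Returns (mean, std, tightness) where tightness = P[X1 > X2]."""
    a = math.sqrt(max(s1 * s1 + s2 * s2 - 2.0 * rho * s1 * s2, 1e-12))
    alpha = (m1 - m2) / a
    T = Phi(alpha)                          # tightness probability
    mean = m1 * T + m2 * Phi(-alpha) + a * phi(alpha)
    second = ((m1 * m1 + s1 * s1) * T
              + (m2 * m2 + s2 * s2) * Phi(-alpha)
              + (m1 + m2) * a * phi(alpha))
    var = max(second - mean * mean, 0.0)
    return mean, math.sqrt(var), T

mean, std, T = gauss_max(10.0, 1.0, 8.0, 1.0)
```

In the first-order model, the same tightness probability T weights the input coefficient vectors when propagating the result of a max (or, symmetrically, a min) operation.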

8.2.2 Statistical Retiming-based Timing Analysis

Similar to retiming-based timing analysis (RTA) [36], which is based on the Bellman-Ford longest path computation, statistical retiming-based timing analysis (SRTA) can be computed using the k-statistical Bellman-Ford (kSBF) algorithm (see Section 8.1). The statistical required time and the statistical arrival time are used to compute the statistical slack, which is then used to identify the criticality of the cells and the nets.

Unlike in the deterministic case, the statistical sequential required time cannot be computed

at the same time as the statistical sequential arrival time because the distribution of the

arrival time at the sink is not yet known until the statistical sequential arrival times of all

125 0.03

0.025

0.02

0.015 f(x)

0.01

0.005

0 −6 −4 −2 0 2 4 6 Random variable − x

Figure 63: Positive cycle

nodes are computed. Note that the statistical sequential required time uses the minimum operation instead of the maximum operation.

Another difference from the kSBF algorithm is the early exit condition. In deterministic

retiming-based timing analysis, the algorithm can perform early exit when the arrival time

at the sink node exceeds the target clock period. In particular, if the graph has a positive

cycle, performing the longest path computation over that cycle can cause the arrival time of

the sink node to exceed the target clock period. This still holds in the statistical case, that

is, if the expectation of the summation of the gate and wire delay over a cycle is positive,

then the arrival time of the sink can exceed the target clock period. This condition can

be used for the algorithm to terminate early. However, one has to be aware that if the

expectation of the summation of the gate and wire delay over a cycle is negative, it does

not mean that there is no positive cycle, as shown in Figure 63. A cycle could be negative

in terms of the mean value, but there is a high probability that a positive cycle exists. Also

note that a strict condition should be used for verifying a positive-mean cycle in the statistical case: a cycle should be treated as safely negative only when its mean plus three times its standard deviation is less than zero. In other words, the expectation of the summation along a cycle should exceed some positive ε value before early exit is triggered, to ensure that the early exit condition comes from positive-mean cycles, not from approximation error.
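The mean-plus-three-sigma screening above can be sketched as follows (independent Gaussian edge delays are assumed so that variances add; the function name and numbers are illustrative):

```python
import math

def cycle_safely_negative(edge_means, edge_stds):
    """True iff the cycle's total delay satisfies mu + 3*sigma < 0,
    assuming independent Gaussian edge delays (variances add)."""
    mu = sum(edge_means)
    sigma = math.sqrt(sum(s * s for s in edge_stds))
    return mu + 3.0 * sigma < 0.0

# Negative mean but large spread: not safely negative (Figure 63 case).
risky = cycle_safely_negative([-2.0], [1.0])    # -2 + 3*1 = 1, not < 0
# Strongly negative mean: safely negative.
safe = cycle_safely_negative([-10.0], [1.0])    # -10 + 3*1 = -7 < 0
```

A cycle that fails this check, like `risky` above, has a non-negligible probability of being positive even though its mean is negative, which is exactly the situation Figure 63 warns about.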

8.2.3 Bounds on Target Clock Period

This subsection provides a theoretical result on the bounds of the target clock period, φ, which will be useful in the binary search procedure. Recall that the target clock period is set to the smallest value for which the graph G = (V, E) has no positive cycle and the arrival time of the sink node, a[t], is no larger than φ. Let the delay of the ith directed simple s−t path in G be represented by li = ψi − κiφ, where ψi denotes the sum of the gate and wire delays along path i, and κi denotes the number of flip-flops on path i. Let C denote the number of directed cycles in G, and let ζj = ξj − σjφ denote the total delay of the jth directed cycle, where ξj denotes the sum of the gate and wire delays along cycle j and σj denotes the number of flip-flops on cycle j. Now φ is the smallest number that satisfies

    φ ≥ a[t] = max_{i=1,...,K} {ψi − κiφ},    (69)

    ζj = ξj − σjφ ≤ 0,  j = 1, . . . , C.    (70)

Equivalently, the target clock period is given by

    φ = max{ max_{i=1,...,K} ψi/(κi + 1), max_{j=1,...,C} ξj/σj }.    (71)

Recall that each gate and wire delay is a random variable, and hence ψi and ξj, which are sums of gate and wire delays, are also random variables. Let φd^l, φd^m, and φd^u denote the values of φ obtained from (71) when all the gate and wire delays are replaced by their lower bounds (best case), means (average case), and upper bounds (worst case), respectively. It is obvious that φ, which is a random variable, lies in [φd^l, φd^u] with probability 1. Moreover, as shown in the theorem below, the mean of φ is bounded below by φd^m.

Theorem 5 Let E[φ] be the mean (expected value) of φ. Then φd^l ≤ φ ≤ φd^u with probability 1, and φd^m ≤ E[φ].

Proof: Only the mean case requires a proof. Equation (71) implies

    φ ≥ ψi/(κi + 1), i = 1, . . . , K,   and   φ ≥ ξj/σj, j = 1, . . . , C.

Since the ψi and ξj are sums of gate and wire delays, and the κi and σj are constants, the expected values of the right-hand terms of the inequalities above can be obtained by replacing all the gate and wire delays by their means. As a result, taking the expectation on both sides of the inequalities shows that the mean of φ is greater than each right-hand term under the deterministic average case. This implies that the mean of φ is greater than the maximum of all such terms, which is φd^m.

Note that the bounds φd^l, φd^m, and φd^u can be obtained by solving deterministic longest path problems.

8.2.4 Mincut-based Constructive Placement

The SRTA algorithm is integrated into a mincut-based global placer to optimize statistical longest paths in sequential circuits. The placer performs multi-level bipartitioning recursively until the desired number of partitions is obtained. Then, statistical retiming-based timing analysis computes the statistical critical paths while considering retiming, and the placer tries to place these paths into a single partition. Unlike the deterministic Bellman-Ford, which can stop before |V| − 1 iterations if there is no update in the graph, the kSBF algorithm requires k + 1 iterations before it can terminate. In addition, the kSBF requires maximum function computations and the propagation of arrival time distributions through the circuit, and hence tends to be much slower than the deterministic approach. To alleviate this problem, the tighter upper and lower bounds on the target clock period from Theorem 5 are used to accelerate the binary search: the feasible clock period of the circuit must lie between the average-case value and the worst-case value of the deterministic RTA.

8.2.5 Retiming Delay Distribution

Unlike the statistical longest path, which can report the delay of the sink node as the output delay of the graph, calculating the retiming delay distribution is more complex. Note that the combinational delay distribution can be computed as the maximum distribution of all the primary output nodes. A heuristic algorithm to calculate the retiming delay distribution is proposed. Given a sequential circuit, the retiming delay distribution is a function of the

longest path distribution of the graph and the maximum mean cycle. Given an acyclic

graph, the retiming delay distribution is the delay distribution of the sink node. On the


Figure 64: Impact of cycle on retiming delay distribution

other hand, if the graph has critical cycles with the cycle delay close to zero in the worst

case, the maximum mean cycle can have an impact on the retiming delay distribution

similar to the deterministic case [60].

However, identifying the statistical maximum mean cycle itself is not trivial and warrants

further research. A heuristic to identify the distribution of the cycle is used. Note that if

the cycle is critical, the worst case delay distribution of the gate and wire delays along

that cycle can closely approach zero. The maximum mean cycle can be approximated by

shifting the distribution of the cycle to the value of the target clock period. Consider the

deterministic case example shown in Figure 64. Suppose the value of the gate delay is one,

and the wire delay is zero. Based on retiming-based timing analysis, the delay of the sink

node is three. However, the feasible clock period is eight because of the critical cycle. The retiming delay distribution can be approximated by taking the maximum between the arrival-time distribution of the sink node and the shifted distribution of the cycle. For example, if the cycle has a large negative delay value, then after shifting, the cycle delay distribution lies far below the target clock period; hence, this cycle has little impact on the computation of the retiming delay distribution. On the other hand, if the cycle is critical, its distribution has a larger impact on the computation of the retiming delay distribution.
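The heuristic can be sketched as follows. The Gaussian (mean, variance) representation, Clark's moment-matching max approximation [29], and the independence between the sink arrival time and each cycle distribution are simplifying assumptions, and the names are illustrative:

```python
import math

def _Phi(x):  # standard normal CDF
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def _phi(x):  # standard normal PDF
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def gauss_max(a, b):
    """Clark's moment-matched Gaussian approximation of max(A, B)."""
    (m1, v1), (m2, v2) = a, b
    theta = math.sqrt(v1 + v2) or 1e-12
    al = (m1 - m2) / theta
    mean = m1 * _Phi(al) + m2 * _Phi(-al) + theta * _phi(al)
    second = ((v1 + m1 * m1) * _Phi(al) + (v2 + m2 * m2) * _Phi(-al)
              + (m1 + m2) * theta * _phi(al))
    return mean, max(second - mean * mean, 0.0)

def retiming_delay_dist(sink_arrival, cycle_dists, target_clock):
    """sink_arrival: (mean, var) of the sink node's arrival time.
    cycle_dists: (mean, var) of each cycle's delay, expressed so that a
    critical cycle has worst-case delay close to zero; shifting by the
    target clock period places critical cycles near the clock period."""
    result = sink_arrival
    for c_mean, c_var in cycle_dists:
        shifted = (target_clock + c_mean, c_var)  # shift cycle to clock
        result = gauss_max(result, shifted)
    return result
```

In the spirit of the Figure 64 example, a sink arrival near three combined with a critical cycle shifted to a clock period of eight is dominated by the cycle term, while a cycle with a large negative mean leaves the sink distribution essentially unchanged.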

8.3 Experimental Results

The SRTA algorithms are implemented in C++/STL, compiled with gcc v3.2.2, and run on

a Pentium IV 2.4 GHz machine. The benchmark set shown in Table 17 consists of six large circuits from the ISCAS89 suite and five large circuits from the ITC99 suite. k + 1 is the number of iterations required by the k-statistical Bellman-Ford (kSBF) algorithm. BW denotes the number of backward edges in the graph.

Table 17: Benchmark circuit characteristics

ckt        gate    PI    PO     FF    k+1     BW
s5378      2828    36    49    163     76     95
s9234      5597    36    39    211    239    354
s13207     8027    31   121    669    510    637
s15850     9786    14    87    597    495    699
s38417    22397    28   106   1636   1444   1660
s38584    19407    12   278   1452   1860   2054
b14o       5401    32   299    245    451    616
b15o       7092    37   519    449    988   1408
b20o      11979    32   512    490   1486   2197
b21o      12156    32   512    490   1511   2209
b22o      17351    32   725    703   1870   2770

Table 18 shows a comparison of the results obtained using the Monte Carlo simulation, the modified Bellman-Ford algorithm with the error bound (eSBF), and the modified

Bellman-Ford with k + 1 iterations (kSBF) on the 8x8 dimension. The Monte Carlo simulation is performed using 10,000 samples. The expectation (mean) and standard deviation (sigma) of the retiming delay distribution are reported. Note that both the eSBF and the kSBF provide results close to the Monte Carlo simulation results, especially in terms of the mean value. The kSBF provides more accurate results than the eSBF because, as pointed out in

Section 8.1, the eSBF can ignore some paths during its computation. Note that it is possible for the eSBF to outperform the kSBF, since the kSBF may require more iterations than necessary: because of the error arising from the analytical model, the higher the number of iterations, the more error accumulates. For most cases, however, the kSBF is more accurate than the eSBF. Both the eSBF and the kSBF substantially outperform the Monte Carlo simulation in terms of runtime. The runtime reported in all tables is the average runtime. The kSBF, however, requires a longer runtime than the eSBF because it performs a higher number of iterations.

Figure 65 shows the comparison in terms of delay distribution among the Monte Carlo simulation, the eSBF, and the kSBF on s5378 benchmark. The solid, dotted, and dashed lines represent the Monte Carlo simulation, the eSBF, and the kSBF, respectively. Results show that the kSBF can provide a distribution similar to that of the Monte Carlo simulation.

Table 18: Comparison between the Monte Carlo simulation, the eSBF, and the kSBF on the 8x8 dimension

ckt        Monte Carlo         eSBF                kSBF
           mean     std.dev.   mean     std.dev.   mean       std.dev.
s5378      179.65     6.27     163.29     5.49     179.47       6.51
s9234      229.24    16.18     164.58     7.14     225.531     10.5073
s13207     304.93    13.27     279.47     8.72     305.7588    13.0487
s15850     363.16    15.83     317.32     7.57     363.373     15.7213
s38417     187.11     4.53     184.95     5.28     189          8.4122
s38584     437.02    19.41     399.62    10.45     436.7471    19.3315
b14o       161.19    10.70     117.42     5.37     160.4306     5.3107
b15o       247.00    10.00     212.72    18.45     247.8771    10.2265
b20o       259.68     9.03     229.16     6.12     267.126      9.3349
b21o       267.57     6.18     240.01     4.08     267.984      9.2448
b22o       286.63    18.92     300.49    25.17     320.6154    15.2599
runtime    21 days            2.7 hours           3.7 hours

In this case, the kSBF is clearly better, since the eSBF stops early and can ignore

some paths.

8.4 Summary

In this chapter, an SBF algorithm is proposed to compute the longest-path-length distribution for directed graphs with cycles. The SBF algorithm is used in Statistical Retiming-based Timing Analysis (SRTA) to check the feasibility of a given target clock period distribution under retiming. Monte Carlo simulation validates the accuracy of the SRTA algorithm. SRTA is also used to guide a global placement with retiming to optimize statistical longest paths in sequential circuits.


Figure 65: Distribution comparison among the Monte Carlo simulation (solid), the eSBF (dotted), and the kSBF (dashed)

CHAPTER IX

CONCLUSION

In deep submicron technology, wire delay is a major obstacle to high-performance microprocessor design. Transistor delay scales down as feature size is reduced, but wire delay does not. Because of this wire delay impact, different module locations and module sizes chosen in the physical design stage can result in different microprocessor performance. In this dissertation, we propose a new paradigm for microprocessor design that combines it with physical design to address the wire delay issue at the microarchitecture design stage. With module locations and sizes known at the microarchitecture stage, the performance of the target microprocessor can be estimated early in this design stage. We show that our microarchitectural floorplanning can result in up to 40% performance improvement. Furthermore, we extend our microarchitectural floorplanning to 3D-ICs, a technology developed to address the wire delay issue. Our results show that we can achieve performance improvement close to that of the ideal case in which there is no wire delay between modules. In addition, we propose a fast gradient search that identifies module sizes in shorter runtime than a runtime-limited brute-force search, yet with higher quality. The initial framework assumes that the target clock period can be met by inserting flip-flops into the system. However, inserting flip-flops at the gates can change the functionality of the circuit; the retiming technique, which maintains circuit functionality, is therefore studied here. With the target clock period as the constraint, slack is available due to unbalanced circuits. To utilize this slack, we use power minimization techniques, such as placing heavily switching gates closer together in placement or applying voltage assignment, to reduce power consumption without a delay penalty. We show that up to 20% and 34% power reduction can be achieved using the placement and voltage scaling techniques, respectively. Finally, we show that in the presence of process variation the retiming technique has to be modified to support statistical timing analysis. We propose a statistical version of retiming that can tolerate the errors introduced by statistical timing analysis.

REFERENCES

[1] Ababei, C., Selvakkumaran, N., Bazargan, K., and Karypis, G., “Multi-objective circuit partitioning for cut size and path-based delay minimization,” in Proceedings of the International Conference on Computer-Aided Design, pp. 181–185, 2002.

[2] Acar, E., Nassif, S. R., Liu, Y., and Pileggi, L., “Assessment of true worst case circuit performance under interconnect parameter variations,” in International Symposium on Quality in Electronic Design, pp. 431–436, 2001.

[3] Agarwal, A., Blaauw, D., and Zolotov, V., “Timing analysis for intra-die process variations with spatial correlations,” in Proceedings of the International Conference on Computer-Aided Design, 2003.

[4] Agarwal, A., Zolotov, V., and Blaauw, D., “Statistical timing analysis using bounds and selective enumeration,” in IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 2003.

[5] Agarwal, V., Hrishikesh, M. S., Keckler, S. W., and Burger, D., “Clock Rate versus IPC: The End of the Road for Conventional Microarchitectures,” in ISCA, 2000.

[6] Alpert, C. J., Chan, T. F., Kahng, A. B., Markov, I. L., and Mulet, P., “Faster minimization of linear wirelength for global placement,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 1998.

[7] Augsburger, S. and Nikolic, B., “Reducing Power with Dual Supply, Dual Threshold and Transistor Sizing,” in Proc. IEEE Int. Conf. on Computer Design, 2002.

[8] Austin, T. M., “Simplescalar tool suite.” http://www.simplescalar.com.

[9] Baer, J.-L. and Wang, W.-H., “On the inclusion properties for multi-level cache hierarchies,” in Proceedings of the 15th Annual International Symposium on Computer Architecture, pp. 73–80, IEEE Computer Society Press, 1988.

[10] Bhardwaj, S., Vrudhula, S., and Blaauw, D., “Tau: Timing analysis under uncertainty,” in Proceedings of the International Conference on Computer-Aided Design, 2003.

[11] Billingsley, P., Probability and Measure. John Wiley & Sons, 1995.

[12] Bowman, K., Duvall, S., and Meindl, J., “Impact of die-to-die and within-die parameter fluctuations on the maximum clock frequency distribution,” in Proc. ISSCC, pp. 278–279, 2001.

[13] Breuer, M. A., “A Class of Min-cut Placement Algorithms,” in Proc. ACM Design Automation Conf., pp. 284–290, 1977.

[14] Brodersen, R., Horowitz, M., Markovic, D., Nikolic, B., and Stojanovic, V., “Methods for True Power Minimization,” in Proceedings of the International Conference on Computer-Aided Design, 2002.

[15] Brooks, D., Tiwari, V., and Martonosi, M., “Wattch: a framework for architectural-level power analysis and optimizations,” in Proceedings of the 27th Annual International Symposium on Computer Architecture, 2000.

[16] Cain, M., “The moment-generating function of the minimum of bivariate normal random variables,” in The American Statistician, 1994.

[17] Caldwell, A. E., Kahng, A. B., and Markov, I. L., “Can recursive bisection alone produce routable placements?,” in Proceedings of the 37th Conference on Design Automation, pp. 477–482, 2000.

[18] Chabini, N., Chabini, I., Aboulhamid, E. M., and Savaria, Y., “Unification of Basic Retiming and Supply Voltage Scaling to Minimize Dynamic Power Consumption for Synchronous Digital Designs,” in Proc. Great Lakes Symposium on VLSI, pp. 221–224, 2003.

[19] Chabini, N. and Wolf, W., “Reducing Dynamic Power Consumption in Synchronous Sequential Digital Designs Using Retiming and Supply Voltage Scaling,” IEEE Transactions on VLSI Systems, vol. 12, pp. 573–589, June 2004.

[20] Chang, H. and Sapatnekar, S., “Statistical timing analysis considering spatial correlations using a single pert-like traversal,” in Proceedings of the International Conference on Computer-Aided Design, 2003.

[21] Chang, J. and Pedram, M., “Energy minimization using multiple supply voltages,” IEEE Transactions on VLSI Systems, 1997.

[22] Chao, K.-Y. and Wong, D. F., “Low power considerations in floorplan design,” in International Workshop on Logic Power Design, 1994.

[23] Chao, M. C.-T., Wang, L.-C., Cheng, K.-T., and Kundu, S., “Static statistical timing analysis for latch-based pipeline designs,” in Proceedings of the International Conference on Computer-Aided Design, 2004.

[24] Chen, C., Srivastava, A., and Sarrafzadeh, M., “On gate level power optimization using dual-supply voltages,” IEEE Transactions on VLSI Systems, 2001.

[25] Chen, C., Srivastava, A., and Sarrafzadeh, M., “On Gate Level Power Optimization Using Dual-Supply Voltages,” IEEE Transactions on VLSI Systems, vol. 9, pp. 616–629, October 2001.

[26] Chen, G. and Sapatnekar, S., “Partition-driven standard cell thermal placement,” in Proc. Int. Symp. on Physical Design, 2003.

[27] Chen, R. and Zhou, H., “Clock schedule verification under process variations,” in Proceedings of the International Conference on Computer-Aided Design, 2004.

[28] Chishti, Z., Powell, M. D., and Vijaykumar, T. N., “Distance associativity for high-performance energy-efficient non-uniform cache architectures,” in Proceedings of the 36th Annual IEEE/ACM International Symposium on Microarchitecture, p. 55, 2003.

[29] Clark, C., “The greatest of a finite set of random variables,” in Operations Research, 1961.

[30] Cong, J., Fan, Y., Han, G., Yang, X., and Zhang, Z., “Architectural Synthesis Integrated with Global Placement for Multi-Cycle Communication,” in ICCAD, 2003.

[31] Cong, J. and Koh, C.-K., “Simultaneous driver and wire sizing for performance and power optimization,” in Proceedings of the International Conference on Computer-Aided Design, 1994.

[32] Cong, J. and Lim, S. K., “Multiway partitioning with pairwise movement,” in Proceedings of the International Conference on Computer-Aided Design, pp. 512–516, 1998.

[33] Cong, J. and Lim, S. K., “Physical planning with retiming,” in Proceedings of the International Conference on Computer-Aided Design, pp. 2–7, 2000.

[34] Cong, J. and Lim, S. K., “Edge separability based circuit clustering with application to circuit partitioning,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, pp. 346–357, August 2003.

[35] Cong, J. and Yuan, X., “Multilevel global placement with retiming,” in Proceedings of the 40th Conference on Design Automation, pp. 208–213, 2003.

[36] Cong, J. and Lim, S. K., “Retiming-based timing analysis with an application to mincut-based global placement,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 23, no. 12, pp. 1684–1692, 2004.

[37] Devgan, A. and Kashyap, C., “Block-based static timing analysis with uncertainty,” in Proceedings of the International Conference on Computer-Aided Design, 2003.

[38] Dhillon, Y. S., Diril, A. U., Chatterjee, A., and Lee, H.-H. S., “ Algorithm for achieving minimum energy consumption in CMOS circuits using multiple supply and threshold voltages at the module level ,” in Proceedings of the International Conference on Computer-Aided Design, pp. 693–700, 2003.

[39] Politecnico di Torino, “ITC benchmark.” http://www.cad.polito.it/tools/itc99.html, 1999.

[40] Durrett, R., Probability: Theory and Examples. Duxbury Press, 1995.

[41] Eble, J. C., De, V. K., Wills, D. S., and Meindl, J. D., “A Generic System Simulator (GENESYS) for ASIC Technology and Architecture Beyond 2001,” in Int’l ASIC Conference, 1996.

[42] Eisenmann, H. and Johannes, F. M., “Generic global placement and floorplanning,” in Proceedings of the 35th Conference on Design Automation, pp. 269–274, 1998.

[43] Ekpanyapong, M., Minz, J., Watewai, T., Lee, H.-H. S., and Lim, S. K., “Profile-guided microarchitectural floorplanning for deep submicron processor design,” in Proceedings of the 41st Conference on Design Automation, 2004.

[44] Faraboschi, P., Brown, G., Fisher, J. A., Desoli, G., and Homewood, F., “Lx: a technology platform for customizable VLIW embedded processing,” in Proceedings of the 27th Annual International Symposium on Computer Architecture, pp. 203–213, 2000.

[45] Farhang-Boroujeny, B., Adaptive Filters — Theory and Applications. John Wiley and Sons, 1998.

[46] Fiduccia, C. and Mattheyses, R., “A linear time heuristic for improving network partitions,” in Proc. ACM Design Automation Conf., pp. 175–181, 1982.

[47] Ghose, K. and Kamble, M. B., “Reducing Power in Superscalar Processor Caches using Subbanking, Multiple Line Buffers and Bit-Line Segmentation,” in Proceedings of the 1999 International Symposium on Low Power Electronics and Design, 1999.

[48] Gieseke, B. A., Allmon, R. L., Bailey, D. W., Benschneider, B. J., Britton, S. M., Clouser, J. D., III, H. R. F., Farrell, J. A., Gowan, M. K., Houghton, C. L., Keller, J. B., Lee, T. H., Leibholz, D., Lowell, S. C., Matson, M. D., Matthew, R. J., Peng, V., Quinn, M. D., Priore, D. A., Smith, M. J., and Wilcox, K. E., “A 600 MHz Superscalar RISC Microprocessor with Out-Of-Order Execution,” in Digest of Technical Papers, International Solid-State Circuits Conference, pp. 176–177, Feb. 1997.

[49] Givargis, T., Vahid, F., and Henkel, J., “System-level exploration for pareto-optimal configurations in parameterized system-on-a-chip,” IEEE Transactions on VLSI Systems, vol. 10, no. 4, pp. 416–422, 2002.

[50] Glaskowsky, P. N., “Pentium 4 (Partially) Previewed,” Microprocessor Report, vol. 14, pp. 11–13, August 2000.

[51] GLPK, “GLPK (GNU Linear Programming Kit).”

[52] Gosti, W., Narayan, A., Brayton, R., and Sangiovanni-Vincentelli, A., “Wireplanning in Logic Synthesis,” in ICCAD, 1998.

[53] Hamada, M. and Ootaguro, Y., “Utilizing Surplus Timing for Power Reduction,” in Proc. IEEE Custom Integrated Circuits Conference, 2001.

[54] Hinton, G., Sager, D., Upton, M., Boggs, D., Carmean, D., Kyker, A., and Roussel, P., “The Microarchitecture of the Pentium 4 Processor,” Intel Technology Journal, Q1 2001.

[55] Hitchcock, R., “Timing verification and the timing analysis program,” in Proc. ACM Design Automation Conf., 1982.

[56] Ho, R., Mai, K. W., and Horowitz, M. A., “The future of wires,” Proceedings of the IEEE, 2001.

[57] Huang, D. and Kahng, A. B., “Partitioning-based Standard-cell Global Placement with an Exact Objective,” in ISPD97, pp. 18–25, 1997.

[58] Huh, J., Burger, D., and Keckler, S. W., “Exploring the design space of future CMPs,” Sept. 2001.

[59] Hung, W., Xie, Y., Vijaykrishnan, N., Kandemir, M., Irwin, M., and Tsai, Y., “Total Power Optimization through Simultaneously Multiple-VDD Multiple-VTH Assignment and Device Sizing with Stack Forcing,” in Proc. Int. Symp. on Low Power Electronics and Design, 2004.

[60] Hurst, A. P., Chong, P., and Kuehlmann, A., “Physical placement driven by sequential timing analysis,” in Proceedings of the International Conference on Computer-Aided Design, 2004.

[61] Ishihara, F., Sheikh, F., and Nikolic, B., “Level conversion for dual-supply systems,” in Proc. Int. Symp. on Low Power Electronics and Design, 2003.

[62] Jyu, H.-F., “Statistical timing analysis of combinational logic,” in IEEE Transactions on VLSI Systems, pp. 126–137, 1993.

[63] Karnik, T., Ye, Y., Tschanz, J., Wei, L., Burns, S., Govindarajulu, V., De, V., and Borkar, S., “Total Power Optimization by Simultaneous Dual-Vt Allocation and Device Sizing in High Performance Microprocessors,” in Proc. ACM Design Automation Conf., 2002.

[64] Karypis, G., Aggarwal, R., Kumar, V., and Shekhar, S., “Multilevel hypergraph partitioning: Application in VLSI domain,” in Proceedings of the 34th Conference on Design Automation, pp. 526–529, 1997.

[65] Khandelwal, V., Davoodi, A., and Srivastava, A., “Efficient statistical timing analysis through error budgeting,” in Proceedings of the International Conference on Computer-Aided Design, 2004.

[66] Kim, C., Burger, D., and Keckler, S. W., “An Adaptive, Non-uniform Cache Structure for Wire-delay Dominated On-chip Caches,” in Proceedings of the 10th Symposium on Architectural Support for Programming Languages and Operating Systems, pp. 211–222, 2002.

[67] Kin, J., Gupta, M., and Mangione-Smith, W. H., “Filtering Memory References to Increase Energy Efficiency,” IEEE Transactions on Computers, vol. 49, no. 1, January 2000.

[68] Kleinhans, J. M., Sigl, G., Johannes, F. M., and Antreich, K. J., “GORDIAN: VLSI placement by quadratic programming and slicing optimization,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, pp. 356–365, 1991.

[69] Kong, T., “A novel net weighting algorithm for timing-driven placement,” in Proceedings of the International Conference on Computer-Aided Design, pp. 172–176, 2002.

[70] Kulkarni, S. and Sylvester, D., “New level converters and level converting logic circuits for multi-VDD low power design,” 2003.

[71] Zhang, L., Hu, Y., and Chen, C. C.-P., “Statistical timing analysis in sequential circuit for on-chip global interconnect pipelining,” in Proc. ACM Design Automation Conf., 2004.

[72] Lalgudi, K. and Papaefthymiou, M., “Fixed-phase retiming for low power,” in Proc. Int. Symp. on Low Power Electronics and Design, 1996.

[73] Le, J., Li, X., and Pileggi, L. T., “Stac: Statistical timing analysis with correlation,” in Proc. ACM Design Automation Conf., 2004.

[74] Lee, H.-H. S. and Tyson, G. S., “Region-based Caching: an Energy Efficient Memory Architecture for Embedded Processors,” in Proceedings of the International Conference on Compilers, Architecture, and Synthesis for Embedded Systems, 2000.

[75] Leiserson, C. E. and Saxe, J. B., “Retiming synchronous circuitry,” in Algorithmica, pp. 5–35, 1991.

[76] Liao, W. and He, L., “Full-Chip Interconnect Power Estimation and Simulation Considering Concurrent Repeater and Flip-flop Insertion,” in ICCAD, 2003.

[77] Liu, Y., Pileggi, L. T., and Strojwas, A. J., “Model order reduction of rc(l) interconnect including variational analysis,” in Proceedings of the 36th Conference on Design Automation, pp. 201–206, 1999.

[78] Long, C., Simonson, L., Liao, W., and He, L., “Floorplanning optimization with trajectory piecewise-linear model for pipelined interconnects,” in Proc. ACM Design Automation Conf., 2004.

[79] McFarling, S., “Combining branch predictors,” in Digital Western Research Lab Technical Report TN-36, 1993.

[80] Mehrotra, V., Nassif, S., Boning, D., and Chung, J., “Modeling the effects of manufacturing variation on high-speed microprocessor interconnect performance,” in IEDM Tech. Dig., pp. 767–770, 1998.

[81] Monteiro, J., Devadas, S., and Ghosh, A., “Retiming sequential circuits for low power,” in Proceedings of the International Conference on Computer-Aided Design, 1993.

[82] Murata, H., Fujiyoshi, K., Nakatake, S., and Kajitani, Y., “Rectangle-packing-based module placement,” in Proceedings of the International Conference on Computer-Aided Design, pp. 472–479, 1996.

[83] Nakatake, S., Fujiyoshi, K., Murata, H., and Kajitani, Y., “Module placement on BSG-structure and IC layout applications,” in Proceedings of the International Conference on Computer-Aided Design, pp. 484–493, 1996.

[84] Nassif, S. R., “Modeling and analysis of manufacturing variations,” in Proc. CICC, pp. 223–228, 2001.

[85] Neumann, I. and Kunz, W., “Placement driven retiming with a coupled edge timing model,” in Proceedings of the International Conference on Computer-Aided Design, pp. 95–102, 2001.

[86] Neumann, I. and Kunz, W., “Tight coupling of timing-driven placement and retiming,” in Proc. IEEE Int. Conf. on Computer Architecture, pp. 351–354, 2001.

[87] Nguyen, D., Davar, A., Orshansky, M., Chinnery, D., Thompson, B., and Keutzer, K., “Minimization of Dynamic and Static Power Through Joint Assignment of Threshold Voltages and Sizing Optimization,” in Proc. Int. Symp. on Low Power Electronics and Design, 2003.

[88] Nose, K. and Sakurai, T., “Optimization of Vdd and Vth for low-power and high- speed applications,” in Proc. Asia and South Pacific Design Automation Conf., 2000.

[89] Eindhoven University of Technology, “LP solve.” ftp:/ftp.es.ele.tue.nl/pub/lp solve/.

[90] Okada, K., Yamaoka, K., and Onodera, H., “A statistical gate-delay model considering intra-gate variability,” in Proceedings of the International Conference on Computer-Aided Design, 2003.

[91] Orshansky, M. and Bandyopadhyay, A., “Fast statistical timing analysis handling arbitrary delay correlations,” in Proc. ACM Design Automation Conf., 2004.

[92] Palesi, M. and Givargis, T., “Multi-objective design space exploration using genetic algorithms,” in Proceedings of the Tenth International Symposium on Hardware/Software Codesign, pp. 67–72, 2002.

[93] Pan, P., “Continuous retiming: Algorithms and application,” in Proc. IEEE Int. Conf. on Computer Design, pp. 116–121, 1997.

[94] Pant, P., Roy, R., and Chatterjee, A., “Dual-Threshold Voltage Assignment with Transistor Sizing for Low Power CMOS Circuits,” IEEE Transactions on VLSI Systems, 2001.

[95] Rabaey, J., Chandrakasan, A., and Nikolic, B., Digital Integrated Circuits. Prentice Hall Electronics, 2003.

[96] Rebaudengo, M. and Reorda, M., “Gallo: a genetic algorithm for floorplan area optimization,” in IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, pp. 943–951, 1996.

[97] Roy, K., Wei, L., and Chen, Z., “ Multiple-Vdd & multiple Vth CMOS (MVCMOS) for low power applications ,” in International Symposium on Circuits and Systems, pp. 366–370, 1999.

[98] Sankaralingam, K., Singh, V. A., Keckler, S. W., and Burger, D., “Routed Inter-ALU Networks for ILP Scalability and Performance,” in ICCD, 2003.

[99] Schreiber, R., Aditya, S., Mahlke, S., Kathail, V., Rau, B. R., Cronquist, D., and Sivaraman, M., “Pico-npa: High-level synthesis of nonprogrammable hardware accelerators,” The Journal of VLSI Signal Processing-Systems for Signal, Image, and Video Technology, vol. 31, no. 2, pp. 127–142, 2002.

[100] Sheikh, F., Kuehlmann, A., and Keutzer, K., “Minimum-power retiming for dual-supply CMOS circuits,” in Proceedings of the 8th ACM/IEEE Workshop on Timing Issues in the Specification and Synthesis of Digital Systems, pp. 43–49, 2002.

[101] Shivakumar, P. and Jouppi, N. P., “CACTI 3.0: An Integrated Cache Timing, Power, and Area Model,” Tech. Rep. 2001.2, HP Western Research Labs, 2001.

[102] SIA, “National Technology Roadmap for Semiconductors,” 2001.

[103] SIA, “National Technology Roadmap for Semiconductors,” 2003.

[104] Singh, D. P. and Brown, S. D., “Integrated retiming and placement for field programmable gate arrays,” in Proc. Int. Symp. on Field Programmable Gate Arrays, pp. 67–76, 2002.

[105] Sirichotiyakul, S., Edwards, T., Oh, C., Zuo, J., Dharchoudhury, A., Panda, R., and Blaauw, D., “Stand-by Power Minimization through Simultaneous Threshold Voltage Selection and Circuit Sizing,” in Proc. ACM Design Automation Conf., 1999.

[106] Sohi, G. and Vajapeyam, S., “Instruction Issue Logic for High-Performance Interruptable Pipelined Processors,” in Proceedings of the 14th Annual International Symposium on Computer Architecture, 1987.

[107] Srivastava, A. and Sylvester, D., “Minimizing Total Power by Simultaneous Vdd/Vth Assignment,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 2004.

[108] Srivastava, A., Sylvester, D., and Blaauw, D., “Power minimization using simultaneous gate sizing, dual-Vdd and dual-Vth assignment,” in Proc. ACM Design Automation Conf., 2004.

[109] Stockmeyer, L., “Optimal orientations of cells in slicing floorplan designs,” Information and Control, pp. 91–101, 1983.

[110] Su, C.-L. and Despain, A. M., “Cache Design Trade-off for Power and Performance Optimization: A Case Study,” in Proceedings of the International Symposium on Low Power Design, 1995.
[111] Sultania, A., Sylvester, D., and Sapatnekar, S., “Tradeoffs between Gate Oxide Leakage and Delay for Dual Tox Circuits,” in Proc. ACM Design Automation Conf., 2004.

[112] Sun, W. J. and Sechen, C., “Efficient and effective placement for very large circuits,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, pp. 349–359, 1995.

[113] Sundararajan, V. and Parhi, K. K., “Synthesis of low power CMOS VLSI circuits using dual supply voltages,” in Proc. ACM Design Automation Conf., pp. 72–75, 1999.

[114] Sutanthavibul, S., Shragowitz, E., and Rosen, J. B., “An analytical approach to floorplan design and optimization,” in Proceedings of the 27th Conference on Design Automation, pp. 187–192, 1991.

[115] Swartz, W. and Sechen, C., “Timing driven placement for large standard cell circuits,” in Proceedings of the 32nd Conference on Design Automation, pp. 211–215, 1995.

[116] Tien, T., Su, H. P., and Tsay, Y., “Integrating logic retiming and register place- ment,” in Proceedings of the International Conference on Computer-Aided Design, pp. 136–139, 1998.

[117] Tsai, C. and Kang, S., “Cell-level placement for improving substrate thermal dis- tribution,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 2000.

[118] North Carolina State University, “ISCAS benchmark.” http://www.cbl.ncsu.edu, 1989.

[119] Usami, K. and Horowitz, M., “ Clustered Voltage Scaling Technique for Low-Power Design ,” in Proc. Int. Symp. on Low Power Electronics and Design, pp. 3–9, 1995.

[120] Usami, K., Igarashi, M., Ishikawa, T., Kanazawa, M., Takahashi, M., Hamada, M., Arakida, H., Terazawa, T., and Kuroda, T., “Design methodol- ogy of ultra low-power MPEG4 codec core exploiting voltage scaling techniques,” in Proc. ACM Design Automation Conf., 1998.

[121] Usami, K., Igarashi, M., Minami, F., Ishikawa, T., Kanzawa, M., Ichida, M., and Nogami, K., “ Automated low-power technique exploiting multiple supply voltages applied to media processor ,” Journal of Solid-State Circuits, vol. 33, no. 3, pp. 463–472, 1998.

[122] Uyemura, J., CMOS Logic Circuit Design. Kluwer Academic Publishers, 2002.

[123] Vaishnav, H. and Pedram, M., “Pcube: a performance driven placement algorithm for low-power designs,” in Proceedings of the European Design Automation Conference, 1993.

[124] Vaishnav, H. and Pedram, M., “Efficient statistical timing analysis through error budgeting,” in IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, pp. 799–812, 1999.

[125] Visweswariah, C., Ravindran, K., Kalafala, K., Walker, S., and Narayan, S., “First-order incremental block-based statistical timing analysis,” in Proc. ACM Design Automation Conf., 2004.

[126] Vygen, J., “Algorithms for large-scale flat placement,” in Proceedings of the 34th Conference on Design Automation, pp. 746–751, 1997.

[127] Wang, Q. and Vrudhula, S., “Algorithms for minimizing standby power in deep submicron, dual-Vt CMOS circuits,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 2002.

[128] Weiss, D., Wuu, J., and Chin, V., “The on-chip 3-MB subarray-based third-level cache on an Itanium microprocessor,” IEEE Journal of Solid-State Circuits, vol. 37, no. 11, pp. 1523–1529, 2002.

[129] Wong, D. and Liu, C. L., “A new algorithm for floorplan design,” in Proceedings of the 23rd Conference on Design Automation, pp. 101–107, 1986.

[130] Wong, D. and Liu, C. L., “Floorplan design for rectangular and L-shaped modules,” in Proceedings of the International Conference on Computer-Aided Design, pp. 101–107, 1987.

[131] Tensilica, Inc. http://www.tensilica.com.

[132] Yeh, C., Chang, M., Chang, S., and Jone, W., “Gate-level design exploiting dual supply voltages for power-driven applications,” in Proc. ACM Design Automation Conf., pp. 68–71, 1999.

[133] Young, F. Y., Chu, C. C. N., Luk, W. S., and Wong, Y. C., “Floorplan area minimization using lagrangian relaxation,” in IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, pp. 687–692, 2001.

[134] Zhang, R., Roy, K., Koh, C.-K., and Janes, D. B., “Stochastic interconnect modeling, power trends, and performance characterization of 3-dimensional circuits,” IEEE Trans. on Electron Devices, vol. 4, 2001.

PUBLICATIONS

This dissertation is based on and/or related to the work and results presented in the following publications in print:

[1] M. Ekpanyapong and S. K. Lim, “Performance-driven Global Placement via Adaptive Network Characterization,” in Proceedings of the 2004 Conference on Asia South Pacific Design Automation, January 2004.

[2] M. Ekpanyapong, K. Balakrishnan, V. Nanda, and S. K. Lim, “Simultaneous Delay and Power Optimization in Global Placement,” in Proceedings of the 2004 IEEE International Symposium on Circuits and Systems, May 2004.

[3] M. Ekpanyapong, J. Minz, T. Watewai, H-H. S. Lee, and S. K. Lim, “Profile-Guided Microarchitectural Floorplanning for Deep Submicron Processor Design,” in Proceedings of the 41st Annual Conference on Design Automation, June 2004.

[4] M. Ekpanyapong, P. Korkmaz, and H-H. S. Lee, “Choice Predictor for Free,” in Proceedings of the 9th Asia-Pacific Computer Systems Architecture Conference, September 2004.

[5] R. M. Rabbah, H. Sandanagobalane, M. Ekpanyapong, and W-F. Wong, “Compiler Orchestrated Prefetching via Speculation and Predication,” in Proceedings of the 11th International Conference on Architectural Support for Programming Languages and Operating Systems, October 2004.

[6] M. Ekpanyapong, M. Healy, and S. K. Lim, “Placement for Configurable Dataflow Architecture,” in Proceedings of the 2005 Conference on Asia South Pacific Design Automation, January 2005.

[7] M. Ekpanyapong, S. K. Lim, C. Ballapuram, and H-H. S. Lee, “Wire-driven Microarchitectural Design Space Exploration,” in Proceedings of the 2005 IEEE International Symposium on Circuits and Systems, May 2005.

[8] M. Healy, M. Ekpanyapong, and S. K. Lim, “Placement and Routing for Configurable Dataflow Architecture,” in Proceedings of the 2005 International Conference on Field Programmable Logic and Application (FPL), August 2005.

In addition, this dissertation is based on and/or related to the work and results presented in the following papers that are accepted, under review, or in submission:

[1] M. Ekpanyapong, T. Watewai, and S. K. Lim, “Statistical Bellman-Ford Algorithm with an Application to Statistical Retiming,” to appear in ACM/IEEE Asia and South Pacific Design Automation Conference (ASPDAC), 2006.

[2] M. Ekpanyapong, J. Minz, T. Watewai, H-H. S. Lee, and S. K. Lim, “Profile-guided Microarchitectural Floorplanning for Deep Submicron Processor Design,” to appear in IEEE Transactions on Computer Aided Design of Integrated Circuits and Systems (TCAD), 2006.

[3] M. Ekpanyapong, M. Healy, and S. K. Lim, “Profile-driven Instruction Mapping for Dataflow Architectures,” submitted to IEEE Transactions on Computer Aided Design of Integrated Circuits and Systems (TCAD).

[4] M. Healy, M. Vittes, M. Ekpanyapong, C. Ballapuram, S. K. Lim, H-H. S. Lee, and G. Loh, “Microarchitectural Floorplanning under Performance and Temperature Tradeoff,” submitted to Design Automation and Test in Europe (DATE), 2006.

[5] M. Ekpanyapong and S. K. Lim, “Retiming for Dual Supply and Threshold Voltage VLSI Circuits,” submitted to International Symposium on Physical Design (ISPD), 2006.

[6] M. Healy, M. Vittes, M. Ekpanyapong, S. K. Lim, H-H. S. Lee, and G. Loh, “Multi-Objective Microarchitectural Floorplanning for 2D and 3D ICs,” to be submitted to IEEE Transactions on Computer Aided Design of Integrated Circuits and Systems (TCAD).

[7] M. Ekpanyapong, K. Puttaswamy, S. K. Lim, and G. Loh, “High Performance Microarchitectural Floorplanning with 3D Modules,” to be submitted to ACM Design Automation Conference (DAC), 2006.
