A Highly Modular Router Microarchitecture for Networks-on-Chip
Item Type: text; Electronic Dissertation
Authors: Wu, Wo-Tak
Publisher: The University of Arizona.
Rights: Copyright © is held by the author. Digital access to this material is made possible by the University Libraries, University of Arizona. Further transmission, reproduction, presentation (such as public display or performance) of protected items is prohibited except with permission of the author.
Download date: 01/10/2021 08:12:16
Link to Item: http://hdl.handle.net/10150/631277

A HIGHLY MODULAR ROUTER MICROARCHITECTURE FOR NETWORKS-ON-CHIP

by
Wo-Tak Wu

Copyright © Wo-Tak Wu 2019

A Dissertation Submitted to the Faculty of the
DEPARTMENT OF ELECTRICAL AND COMPUTER ENGINEERING
In Partial Fulfillment of the Requirements For the Degree of
DOCTOR OF PHILOSOPHY
In the Graduate College
THE UNIVERSITY OF ARIZONA
2019

THE UNIVERSITY OF ARIZONA GRADUATE COLLEGE

As members of the Dissertation Committee, we certify that we have read the dissertation prepared by Wo-Tak Wu, titled A HIGHLY MODULAR ROUTER MICROARCHITECTURE FOR NETWORKS-ON-CHIP and recommend that it be accepted as fulfilling the dissertation requirement for the Degree of Doctor of Philosophy.

Dr. Linda Powers
Date: August 7, 2018
Dr. Roman Lysecky

Final approval and acceptance of this dissertation is contingent upon the candidate's submission of the final copies of the dissertation to the Graduate College.

I hereby certify that I have read this dissertation prepared under my direction and recommend that it be accepted as fulfilling the dissertation requirement.

Date: August 7, 2018
Dissertation Director: Dr. Janet Roveda

Acknowledgements

I would like to express my gratitude to Prof. Ahmed Louri for introducing me to network-on-chip, an exciting new area in computer architecture research.
Under his guidance and support, I was able to learn a great deal in this area, and we were able to publish our research results [1]. Unfortunately, Prof. Louri moved on to George Washington University, but I decided to stay here at UA. I picked up a totally different research direction in network-on-chip. This dissertation represents the results of the second half of my research career at UA.

Most importantly, I would like to thank my current advisors, Profs. Janet Roveda and Linda Powers. They are truly great teachers and mentors. Without their guidance, support and encouragement, it would not have been possible to finish my graduate studies at UA. I would also like to thank Prof. Roman Lysecky for serving on the dissertation committee and, a few years ago, on the written comprehensive exam committee.

On a personal note, I must thank my family, especially my wife Katie, for their unconditional love and support. Without them, this long journey would never have even started.

Contents

List of Figures
List of Tables
Abstract
Chapter 1 Introduction
  1.1 Chip Multiprocessor
  1.2 Network-on-Chip
  1.3 Motivation
  1.4 Contributions
  1.5 Dissertation Outline
Chapter 2 Network-on-Chip Basics
  2.1 Bus-Based Interconnect
  2.2 NoC
    2.2.1 Wires
    2.2.2 Router
    2.2.3 Latency
    2.2.4 Power
    2.2.5 Area
  2.3 Features
    2.3.1 Bandwidth
    2.3.2 Scalability
    2.3.3 Parallel Communications
    2.3.4 Clock Frequency
    2.3.5 Fault Tolerance
    2.3.6 Power Consumption
    2.3.7 System Integration
    2.3.8 Chip Layout
    2.3.9 Clock Distribution
    2.3.10 Packet Switching
    2.3.11 Summary
  2.4 Key NoC Characteristics
    2.4.1 Topology
    2.4.2 Routing Algorithm
    2.4.3 Flow Control
  2.5 Challenges
Chapter 3 Omega Router
  3.1 Conventional Router
    3.1.1 Microarchitecture
    3.1.2 Router Pipeline
  3.2 Omega Microarchitecture
    3.2.1 Top-Level Design
    3.2.2 Exchange
    3.2.3 Datapath
    3.2.4 Timing
    3.2.5 Routing
  3.3 Evaluations
    3.3.1 Simulator
    3.3.2 VLSI Design Tools
    3.3.3 Network Configurations
    3.3.4 Network Traffic
    3.3.5 Simulation Platform
    3.3.6 Running Simulations
  3.4 Experiments and Results
    3.4.1 Synthetic Traffic
    3.4.2 PARSEC Applications
    3.4.3 Circuit Synthesis
  3.5 Analyses
    3.5.1 Network Latency
    3.5.2 Network Saturation
    3.5.3 Network Throughput
    3.5.4 Area and Power
    3.5.5 Critical Path Delay
    3.5.6 Summary
  3.6 Related Work
  3.7 Discussion
Chapter 4 Circuit Implementation
  4.1 Route Computation
    4.1.1 Inter-Router
    4.1.2 Inter-Exchange
  4.2 Buffer
    4.2.1 Write
    4.2.2 Read
  4.3 Output Arbiter
  4.4 Buffer Arbiter
  4.5 Summary
Chapter 5 Buffer and Link Utilization Improvement
  5.1 Motivation
  5.2 Microarchitecture Enhancement
    5.2.1 Merging
    5.2.2 Splitting
  5.3 Evaluations
  5.4 Results and Analysis
    5.4.1 Network Latency
    5.4.2 Network Saturation
    5.4.3 Network Throughput
    5.4.4 Area, Power and Critical Path Delay
  5.5 Summary
Chapter 6 Conclusion
Bibliography

List of Figures

1.1 42 Years of Microprocessor Trend Data.
1.2 Generic CMP System Configuration.
1.3 A network-on-chip connecting CMP and off-chip memory.
1.4 Link/Buffer utilization at various widths.
2.1 A single bus connecting all processing cores.
2.2 A simple RC model of a circuit.
2.3 A 4 × 4 mesh network with four concurrent communication paths.
2.4 Message format.
2.5 Network topology examples.
2.6 (a) Packet traverses from Node 6 to Node 2. (b) Routing tables.
3.1 Conventional router microarchitecture.
3.2 Router pipeline.
3.3 Conventional router microarchitecture datapath.
3.4 Omega router microarchitecture with a ring network of exchanges.
3.5 (a) Exchange interface. (b) Exchange internal design.
3.6 Exchange buffer arbiters.
3.7 Omega router datapath.
3.8 Omega router buffer read and write operations.
3.9 Time-space diagram showing how a flit traverses a series of exchanges with five types of operations.
3.10 Average network latencies from 6 synthetic traffic patterns, full range.
3.11 Average network latencies from 6 synthetic traffic patterns, up to saturation.
3.12 Average network throughputs in all synthetic traffic.
3.13 Average network latencies from PARSEC applications.
3.14 Average network latencies from 6 traffic patterns.
3.15 Saturation points from 6 traffic patterns.
3.16 Performances normalized to base-1-8.
4.1 High level view of buffer.
4.2 Write operation of buffer.
4.3 Read operation of buffer.
4.4 Output arbiter high-level view.
4.5 Buffer