
A MULTI-PARADIGM C++-BASED HARDWARE DESCRIPTION LANGUAGE

A Thesis Presented to The Academic Faculty

by

Chad D. Kersey

In Partial Fulfillment of the Requirements for the Degree Doctor of Philosophy in the School of Electrical and Computer Engineering

Georgia Institute of Technology
December 2019

Copyright © 2020 by Chad D. Kersey

A MULTI-PARADIGM C++-BASED HARDWARE DESCRIPTION LANGUAGE

Approved by:

Saibal Mukhopadhyay, Committee Chair
School of Electrical and Computer Engineering
Georgia Institute of Technology

Tushar Krishna
School of Electrical and Computer Engineering
Georgia Institute of Technology

Sudhakar Yalamanchili, Advisor
School of Electrical and Computer Engineering
Georgia Institute of Technology

Hyesoon Kim
School of Electrical and Computer Engineering
Georgia Institute of Technology

Thomas Conte
College of Computing
Georgia Institute of Technology

Richard Vuduc
School of Computational Science and Engineering
Georgia Institute of Technology

Date Approved: 09/30/2019

For Sudha, without whom none of this would have happened.

ACKNOWLEDGEMENTS

The work that eventually became this dissertation started at the end of 2012, though the influences leading to its creation reach back much further, and the number of people whose input shaped it is staggering. It is difficult to work on something for so many years without owing quite a bit of gratitude. I am sure I will not be able to name every contribution here, but I will attempt to name those who still come most prominently to mind.

Much of the work that would become this dissertation was done in collaboration with the Structural Simulation Toolkit team at Sandia National Laboratories' Computer Science Research Institute and/or funded by the US Department of Energy. Arun Rodriguez especially has been instrumental in his role as a mentor through five separate internships at Sandia, beginning when C++11 was still the upcoming C++0x standard and ending after C++14 had been published. CHDL would likely have been built using another language if it had not been for my work with SST.

My many lab mates in Georgia Tech's Computer Architecture and Systems Lab were helpful and supportive, both in matters technical and mundane. Nawaf Almoosa, Eric Anger, Xinwei Chen, Dhruv Choudhary, Greg Diamos, Naila Farooqui, Minhaj Hassan, Andrew Kerr, Si Li, Adam McLaughlin, Karthik Rao, Ifrah Saeed, William Song, Blaise Tine, Jin Wang, Haicheng Wu, and Hugh Xiao all contributed to making my time at Georgia Tech enjoyable and memorable and participated in technical discussions that strengthened the work presented here. Meghana Gupta in particular contributed important work to the HARP architecture used in Harmonica, including contributing a compiler back-end. Jeff Young has been instrumental in the final stages of writing and preparing this document, and his feedback has been a steady pressure forcing text into what would otherwise be voids.

My advisor throughout nearly all of this work, and for years before it, was the late Sudhakar Yalamanchili, whose patient guidance pushed me along this program of research. I would be hard pressed to name a single person who has had more influence on my work or who has shaped my career more.

Finally, I must acknowledge all of the people who contributed much-needed community and camaraderie, including Brian Akers, Josh Cowan, Ryan and Emily Curtin, and John Frey. My wife Ashley and our little trembly mutt Sarah, both of whom will surely be quite happy to spend an evening with me in which I am not staring at a laptop screen, have been a dependable, stabilizing force in my life and a great source of emotional support throughout a long program of study, and for that I cannot thank them enough.

TABLE OF CONTENTS

DEDICATION

ACKNOWLEDGEMENTS

LIST OF TABLES

LIST OF FIGURES

SUMMARY

I INTRODUCTION
  1.1 Summary of Contributions

II RELATED WORK
  2.1 Fixed-Paradigm Hardware Description Systems
    2.1.1 Traditional Hardware Description Languages
    2.1.2 Data Flow Systems
    2.1.3 High-level Synthesis
  2.2 Extensible Hardware Description Systems
    2.2.1 Simulation and Verification Oriented Systems
    2.2.2 Generative Hardware Description Languages
  2.3 Extensible Generative Systems
    2.3.1 Extending Generative HDLs
  2.4 Motivation for Present Work

III DESIGN OF CHDL
  3.1 Introduction
  3.2 Synchronous Design
  3.3 Structure of CHDL
    3.3.1
  3.4 CHDL Core Library
    3.4.1 Basic Hardware Manipulation
    3.4.2 Optimization
    3.4.3 Node Types
    3.4.4 Logic Functions Library
    3.4.5 Inputs, Outputs, and Simulation Interfaces
  3.5 CHDL Template Library
    3.5.1 Aggregate Data Types: ag and un
    3.5.2 Directed Data Types: directed, in, out, inout
    3.5.3 Loadable Module Support
    3.5.4 IP Generators
    3.5.5 Memory System Interfaces and Components
    3.5.6 Numeric Types and Operations
    3.5.7 RTL Description in CHDL
  3.6 Introspection
    3.6.1 Applications of Netlist Introspection
  3.7 Conclusions

IV THE HARMONICA DATA PARALLEL CORE
  4.1 Iqyax RISC Core
  4.2 Introduction
  4.3 Background and Overview
    4.3.1 Hybrid Memory Cube (HMC)
    4.3.2 Architectural Constraints
  4.4 The HARP Architectures
    4.4.1 Execution Model
    4.4.2 Parameterization
    4.4.3 Instruction Set Features
    4.4.4 Kernel Launch Model
    4.4.5 Exception Support
  4.5 The Harmonica Microarchitecture
    4.5.1 Detail
  4.6 Tool Flow
    4.6.1 Software Flow
    4.6.2 Hardware Flow
  4.7 Performance Analysis
    4.7.1 Benchmarks
    4.7.2 Results
    4.7.3 Impact of Cache
  4.8 Context
  4.9 Conclusion

V GUARDED ATOMIC ACTIONS IN CHDL
  5.1 The GAA Paradigm
  5.2 An Implementation of GAA in CHDL
    5.2.1 gaareg: A Templated GAA Register Type
    5.2.2 Rule(): Expressing GAA Rules
    5.2.3 C++ Classes as Modules
    5.2.4 Scheduler Generator
  5.3 Applications Implemented Using CHDL GAA
    5.3.1 GCD: Euclidean Algorithm
    5.3.2 Other Examples
  5.4 Conclusions

VI CHEETAH: A PIPELINED HLS ENVIRONMENT WITHIN CHDL
  6.1 Pipeline-Oriented Hardware Description Language
  6.2 Pipeline Model
  6.3 Pipeline Description Language API
    6.3.1 Pipeline Stages
    6.3.2 Pipeline Control Flow
    6.3.3 Pipeline-Carried Values
  6.4 An Example: Mandelbrot Set
    6.4.1 Constant Declarations
    6.4.2 Pipeline Variable Declarations
    6.4.3 Spawn Loop
    6.4.4 Room for Retiming
    6.4.5 Main Iteration Logic
    6.4.6 Serialization of Results
  6.5 Applicability to Instruction Set Processors
    6.5.1 Example Processor Design
    6.5.2 Signals
    6.5.3 Instruction Fetch
    6.5.4 Register File and Scoreboard
    6.5.5 Dispatch and Branch Resolution
    6.5.6 Pipelined Functional Units
    6.5.7 FIFO Input/Output
    6.5.8 Register Writeback
    6.5.9 Operation
    6.5.10 Applicability to Harmonica and Other SIMT Core Designs

VII CONCLUSIONS
  7.1 Thesis Contributions Revisited
  7.2 Future Directions
  7.3 Concluding Remarks

APPENDIX A — HARP INSTRUCTION SET

REFERENCES

VITA

LIST OF TABLES

1  A list of hardware description systems discussed academically, taxonomized by levels of abstraction supported and extensibility.
2  Utility functions for instantiating memory.
3  Logical operations and overloads.
4  Bit vector operations and overloads.
5  Memory system components provided by the CHDL template library.
6  Numeric types defined in the CHDL template library.
7  Selected instructions from the HARP instruction sets.
8  Some of the calls in the HARP runtime library.
9  The benchmarks used in our evaluation of the Harmonica core design. Cited papers are sources for the benchmarks used.
10 Applications implemented using the CHDL implementation of guarded atomic actions.
11 Cheetah functions for defining pipeline stages.
12 Cheetah functions for control flow between pipeline stages.
13 Important member functions of the plvar type.

LIST OF FIGURES

1  A simple design cycle using CHDL.
2  Interfaces within CHDL and its extensions are built up as a permeable hierarchy.
3  Overview of dataflow in CHDL.
4  Contraction rules implemented by CHDL, most of which serve to implement constant folding and strength reduction type optimizations.
5  Example design using CHDL-RTL; an implementation of the Sieve of Eratosthenes.
6  Waveforms for the example from Figure 5, given N is 16.
7  Caching and pre-optimization of floating point operation implementations improves both elaboration and optimization time for a Mandelbrot sample design. Bars show run-time of first design elaboration, optimization command run, and subsequent runs with pre-cached floating point operation netlists.
8  Significant speedup in all stages of design can be achieved through the use of mixed-level simulation of microarchitecture models.
9  The retiming transformation automatically pipelines functional units. In this example, a floating point multiplier is retimed to between 2 and 6 stages, leading to an increase in achievable throughput with a slight overhead in latency and design size.
10 The basic gate-level instrumentation and the pipelined Wallace tree into which it feeds.
11 Gate count overhead and error for levels of instrumentation from one gate in ten thousand to all gates.
12 Hypothetical floor-plan of a current-generation HMC compared to an accelerated vault. At least 1.5mm² per vault is expected to be made available for accelerators by moving from a 28nm to a 15nm process.
13 Simplified diagram of an SIMT pipeline.
14 Split and join instructions are used to manage control flow divergence.
15 The Harmonica pipeline.
16 The Harmonica register file design is somewhat simplified compared to that of commercially-available SIMT GPUs.
17 Complete software and hardware flows, including examples of evaluation by CHDL simulation, Verilog simulation, and execution in FPGA.
18 Simplified diagram of the tool flow used to evaluate the Harmonica design.
19 Area estimates for Harmonica cores, not including L1 caches, based on synthesis.
20 Average power for 1ms of each application running at 650MHz.
21 Relative application performance in simulation as a function of memory access latency for 4w32/32/16 architectures with 16 or 32 warps. 32 cycles is highlighted as it represents the most realistic latency estimate for local vault accesses.
22 Bandwidth as a function of application and Harmonica core size. The number of warps and the number of lanes are the same in this case, and the number of GPRs is held at 32.
23 Plot of miss ratio vs. slowdown for various data cache associativities and capacities from 64 bytes to 16 kilobytes, simulated with a realistic vault model.
24 Guarded atomic actions operate at a level of abstraction more concrete than high-level synthesis but more abstract than the register transfer level.
25 The CHDL implementation of guarded atomic actions performs the transformation of GAA variables into registers which are assigned based on the values of guard predicates.
26 Example of a colored graph. No adjacent nodes have the same label, here represented by a pattern.
27 The static scheduling algorithm option for the CHDL GAA implementation relies on graph coloring. Bold-face type is used to represent the names of variables describing signals in the generated hardware.
28 The dynamic scheduling algorithm option for the CHDL GAA implementation relies on the generation of relatively expensive logic. Bold-face type is used to represent the names of variables describing signals in the generated hardware.
29 An example (a) of a pipeline design that fits the model used and (b) the corresponding generalized pipeline stage design.
30 Mandelbrot set visualization produced by simulation of the CHDL/Cheetah design. Intensity represents the number of iterations required to prove divergence.
31 Waveform of operation of the pipelined instruction set processor example with a loop of repeated get and add instructions.
32 Copy of Figure 2. The components of CHDL described in this dissertation are outlined in dashed lines.

SUMMARY

This dissertation presents a generative hardware description library for C++, the CHDL Hardware Design Library, or CHDL, along with a body of supporting libraries and a description of a core design implemented using this library. The supporting libraries extend the level of abstraction covered by CHDL from the solely constructive and generative to a range of hardware description paradigms including the register transfer level (RTL), an implementation of Bluespec-like guarded atomic actions (GAA), and a novel pipeline-oriented HDL providing a high-level synthesis flow from algorithmic descriptions of pipelined hardware. Design input using all of these paradigms is converted by CHDL into an in-memory gate-level netlist that may be simulated, emitted as synthesizable Verilog, or technology mapped to a standard cell library for area and energy estimation. Access to this netlist, dubbed "netlist introspection", is provided by the CHDL API, allowing novel optimizations and transformations to be performed by the designer.

CHAPTER I

INTRODUCTION

The end of planar silicon brought with it the last traces of Dennard scaling. Denser logic circuits in the latest process technology nodes no longer provide improvements in energy per operation, precipitating the era of dark silicon. Homogeneous multi-core processors with ever-increasing core counts are a poor choice in this regime since, when faced with the choice of adding cores that must be duty cycled or simply adding cache banks, the cache banks are likely to have a greater marginal impact on performance. Heterogeneous multi-cores containing a variety of different types of accelerator cores, or homogeneous multi-cores devoting their dark silicon area to larger and more diverse instruction sets, are a more appropriate choice, but increase the demand for design and verification, in turn increasing demand for designer productivity.

Fixed-function accelerators, accelerators implementing specialized instruction sets, and adding instruction set features to existing cores are likely to improve performance but must be specified, designed, and verified. While there are tools for the early evaluation of architectures and the transaction-level modeling of systems, once these phases of the design process have been finished, high-fidelity modeling of area and power consumption of accelerators must be performed from a toolchain that can produce a register transfer level description and, ultimately, a gate-level design. The language used as the input to this toolchain, in which the processor architecture is ultimately expressed, is typically a hardware description language. High-level synthesis tools exist to convert designs expressed as programs in high-level programming languages to hardware designs, but the quality-of-results obtained by using such tools for specialized designs such as instruction set processors is often low, while the microarchitecture is specified by the HLS tool and not the designer.

The history of hardware description languages is almost as long as the history of programming languages themselves, and many different languages providing the ability to express hardware at different levels of abstraction employing different paradigms have been created. The industry standard for expressing place-and-route-ready netlists is the structural subset of Verilog or simply the SPICE netlist. These are the lowest-level hardware description languages available; below netlists there are only polygons. Synthesizable logic has been expressed for decades using the register transfer level subsets of HDLs such as VHDL and Verilog. The process of converting RTL to high-quality logic that meets timing constraints is not trivial, but it is straightforward and does not entail many design decisions that cannot be efficiently and effectively solved by software-based optimization. The same cannot be said for levels of abstraction above the register transfer level.

The first step beyond the register transfer level is simply generative, like the generator loops provided by Verilog or the fundamental structure of constructive HDLs like Chisel [4]. In a generator, register transfer level functions or instances are created by some form of program or loop. Beyond this are dataflow paradigms that have locally RTL-like syntax but automatically generate structures such as ready and valid signals, beginning to obscure the timing of the generated code from the designer. Above this level of abstraction are systems that automatically generate fairly complex structures such as pipelines.
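
To make the generative level concrete, the sketch below uses an ordinary C++ loop to elaborate a balanced AND-reduction tree, the kind of structure a Verilog generate loop would otherwise describe. It assumes only the CHDL node type and its overloaded && operator, both described in Chapter 3; the commented-out header path is a placeholder, not the actual include.

    #include <cstddef>
    #include <vector>
    // #include <chdl.h>  // placeholder for the CHDL core header

    // Build a balanced AND tree over the given nodes. The loop runs at
    // elaboration time, so only the gates it instantiates end up in the
    // netlist; no loop exists in the generated hardware.
    chdl::node and_reduce(std::vector<chdl::node> v) {
      while (v.size() > 1) {
        std::vector<chdl::node> next;
        for (std::size_t i = 0; i + 1 < v.size(); i += 2)
          next.push_back(v[i] && v[i + 1]);   // one AND gate per pair
        if (v.size() % 2) next.push_back(v.back());
        v = next;
      }
      return v.at(0);   // precondition: at least one input node
    }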

Although its input is simply an annotated dataflow graph, the output of the Sehwa [44] pipeline generator is a pipelined implementation with timing guided by a set of constraints but inscrutable to the designer. Beyond this are high-level synthesis tools that produce customized data paths to implement software expressed in a high-level language originally intended for creating software, including the behavioral dialects of hardware description languages such as VHDL, Verilog, and SystemC.

CHDL, which along with its auxiliary libraries and core designs is the topic of this dissertation, is a C++ library enabling hardware design, using the language-expanding features of C++, such as operator overloading and template metaprogramming, to produce a hardware-oriented domain-specific language. The base CHDL library provides mechanisms to describe combinational and synchronous sequential logic on signals of one or multiple bits, to simulate these designs, to technology map these designs to standard cell libraries for analysis, or to output these designs as synthesizable Verilog for further processing by external applications. The decision to restrict sequential logic to synchronous designs was made to simplify the library design as well as formal analysis of the designs made using the library, and to guarantee compatibility with contemporary logic synthesis tools. C++ was chosen for a combination of its widespread install base and knowledge base combined with its support for features, such as operator overloading and template metaprogramming, enabling the creation of APIs that function as domain-specific languages. The use of C++ for CHDL itself also makes it possible to implement hardware description, synthesis, and simulation all in the same program, a feature unique among extensible HDLs.

Contents of blink.cpp (the two #include targets were lost in extraction; <fstream> follows from the use of ofstream, and the CHDL header path shown is an assumption):

    #include <fstream>
    #include <chdl.h>   // assumed CHDL core header

    using namespace std;
    using namespace chdl;

    int main() {
      node x;
      x = !Reg(x);
      OUTPUT(x);

      ofstream vcd("blink.vcd");
      run(vcd, 100);
      return 0;
    }

Command line:

    $ c++ -o blink blink.cpp -lchdl
    $ ./blink

Figure 1: A simple design cycle using CHDL.

Template metaprogramming in particular allows for compile-time type safety for data types representing signals, including structured signals. A mechanism called netlist reflection is provided in which the design is maintained as an in-memory netlist, and functions are provided for reading and manipulating this netlist, allowing programs using CHDL to interact with the simulator and process the in-memory design for optimization and analysis.
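
As a sketch of what this looks like in practice, the blink.cpp generator of Figure 1 can invoke a netlist pass between elaboration and simulation. Only optimize(), named in Figure 2, is added here; everything else follows Figure 1, and the header path remains the same assumption made there.

    #include <fstream>
    #include <chdl.h>   // assumed CHDL core header, as in Figure 1

    using namespace std;
    using namespace chdl;

    int main() {
      node x;
      x = !Reg(x);        // elaborate: a register that toggles every cycle
      OUTPUT(x);
      optimize();         // process the in-memory netlist before simulating it
      ofstream vcd("blink.vcd");
      run(vcd, 100);      // simulate the optimized design
      return 0;
    }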

The design cycle using CHDL, illustrated in Figure 1, is the edit-compile-debug cycle, potentially using a waveform viewer as a part of the debug step. This allows the design and simulation of complex digital hardware using only CHDL and a C++ compiler.

A follow-on library, the CHDL template library, expands CHDL with support for structured signals and interfaces, including a standard interface for memory system components, and numeric types including fixed- and floating-point real numbers and signed and unsigned integers, and provides a few basic logic elements including FIFOs and stacks. Most designs that have been built using CHDL use CHDL template library data types, as these data types provide convenient ways to represent structured signals containing more than a simple numerically-addressed array of bits. A separate library built on top of the CHDL template library's support for directed and structured signals provides a way to load and store pieces of designs as CHDL netlists, enabling the distribution of IP blocks and the construction and simulation of quite large designs incorporating diverse pieces, with no requirement that the pieces were themselves constructed using CHDL.

The HARP family of instruction sets was created as a family of extensible GPGPU-like RISC instruction sets for systems in which programmable accelerators are treated as first-class cores with the ability to handle both internally-generated exceptions and external interrupts. HARP is designed primarily as a family of instruction sets, with both fixed-point and floating-point real number arithmetic and with a configurable word size, register file size, and degree of SIMT parallelism. As the intent of the CHDL system is to provide a multi-paradigm system for prototyping accelerator architectures, the HARP architectures provide a natural demonstration vehicle.

The combination of CHDL and the template library forms a fairly complete register transfer level hardware description language supporting the use of C++ templates for generics and C++ control flow constructs for generators. This was successfully employed to produce Harmonica, an implementation of the HARP instruction set architectures. A runtime library, compilation system, and set of applications were developed to evaluate this family of architectures, and the parameterizability inherently afforded by the template system of C++ as used by CHDL enabled a wide range of Harmonica designs to be evaluated, from versions providing no SIMT parallelism at all to 1024-lane versions with multi-kilobyte register files.

Harmonica demonstrates the complexity of design that may be undertaken using only the RTL features of CHDL, but rapid development of accelerators and architectures to keep pace with the ever-growing demand for computing performance requires the exploration of higher-level hardware description paradigms. Guarded Atomic Actions (GAA), as popularly implemented in the Bluespec SystemVerilog [40] hardware description language, is a dataflow paradigm for implementing complex hardware systems as hierarchies of finite state machines, each described using an RTL-like syntax that incorporates automatic generation of "ready" and "valid" signals at the interfaces and maintenance of atomicity for updates to registers. The CHDL GAA library implements guarded atomic actions on top of existing CHDL structures, allowing any CHDL data type to be used as a data type for registers in GAA systems and also allowing any CHDL modules to be instantiated and used from within GAA designs.

This multi-paradigm interoperability is a key feature of CHDL, best illustrated with an example. A common piece of example code in descriptions of GAA libraries in the literature is Euclid's algorithm for finding the greatest common divisor of two integers. This algorithm can also be used to find the greatest common divisor of other objects such as polynomials over finite fields. In Chapter 5, an example of an abstract GCD algorithm is provided. This has been used with both the CHDL core library's bvec type, which behaves as an unsigned integer, and a gf type, which behaves as a polynomial. The gf code, originally intended for RTL implementations of cryptographic accelerators and implemented before CHDL-GAA was available, may be used in a GAA context with no additional glue code.
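
To illustrate the kind of abstraction involved, consider a plain C++ sketch (not the CHDL-GAA formulation, which appears in Chapter 5): a Euclidean GCD written against a type parameter serves any type supplying equality, a remainder operator, and a zero value, whether a machine integer or a polynomial class.

    #include <cstdio>

    // Generic Euclid: works for unsigned integers, and equally for a class
    // representing polynomials over a finite field, provided the class
    // defines operator% as polynomial remainder and T(0) as the zero element.
    template <typename T>
    T gcd(T a, T b) {
      while (!(b == T(0))) {
        T r = a % b;
        a = b;
        b = r;
      }
      return a;
    }

    int main() {
      std::printf("%u\n", gcd(48u, 36u));   // prints 12
      return 0;
    }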

The level of abstraction provided by domain-specific languages that can be built around CHDL is not limited to structures like guarded atomic actions in which signal names always represent the same value. It is also possible, by returning a context-dependent CHDL signal each time an object representing a variable is referenced, to add additional parallelism through techniques such as pipelining. Cheetah is a pipeline-oriented hardware description language built on top of CHDL, and provides an example of a more abstract paradigm implemented within the CHDL framework.

Cheetah pipelines are described as a set of stages, each of which can be considered equivalent to a basic block in software. Multiple threads of execution may propagate through the pipeline simultaneously, communicating through non-pipeline-carried signals. In this analogy, the pipeline-carried signals are equivalent to variables accessed by the program, and the flow of data between pipeline stages steered by multiplexers is equivalent to control flow. This analogy can be used as a high-level synthesis framework to generate parallel, pipelined fixed-function accelerators implementing specific algorithms, or as a hardware-oriented design paradigm for describing pipelined architectures. Examples of both are explored.

[Figure 2 is a layer diagram; top to bottom: CHDL-GAA (gaareg) and Cheetah (plreg); CHDL-STL (rtlreg, si, ui, fp, fxp); the CHDL core (techmap(), optimize(), bvec, vec, ag, un, node, regimpl, trisimpl, nandimpl, nodeimpl, tickable); and the C++ language and standard library.]

Figure 2: Interfaces within CHDL and its extensions are built up as a permeable hierarchy.

1.1 Summary of Contributions

Performing the kind of low-level description necessary to model accelerators with accurate timing and energy consumption requires the use of hardware description languages. It is not clear at this time that there is a preferred or dominant model for hardware description. From generative approaches such as Chisel [4], to RTL description in traditional HDLs like SystemVerilog, to data flow oriented languages such as Bluespec [40] and CAL [20], to high-level synthesis tools such as the pipeline-oriented Sehwa [44], there are many different popular tools and paradigms for describing digital hardware. Given the continued diversity of hardware design tasks and approaches after more than half a century of hardware description language development, it does not seem appropriate to ask what the single ideal hardware description language paradigm should be. A language for expressing digital hardware designs should enable multiple paradigms, from gate-level design through high-level synthesis, in a manner that preserves interoperability between blocks designed using different approaches. This thesis proposes and evaluates the following hypothesis: By adopting a general-purpose language with strong support for construction of domain specific languages, such as C++, as a hardware description language and building a layered set of abstractions around a core of simple primitives, we can produce interoperable designs using a diverse set of paradigms, from gate-level description to high-level synthesis.

The major research contributions presented in this document are:

• The evaluation of a multi-paradigm hardware description system, CHDL, from the point of view of designer productivity.

• The evaluation of the performance and quality-of-results impact of novel optimizations implemented using CHDL's unique netlist introspection feature.

• The evaluation of a data parallel core designed using this hardware description system.

The major engineering contributions also presented are:

• An open source C++ domain specific language for hardware description, CHDL.

• A novel family of data parallel instruction set architectures and cores implementing these instruction set architectures, HARP and Harmonica.

• Layers implementing higher-level hardware description paradigms on top of CHDL.

The remainder of this document starts with Chapter 2, which describes related work in hardware description languages and high-level synthesis, followed by a layer-by-layer building-up of the abstractions developed and results gathered for this dissertation. Chapter 3 describes CHDL, a C++ hardware description language including a core library and a template library of common hardware primitives. Included in this is a discussion of the novel netlist reflection technique and a range of techniques developed using it to perform optimization of both elaboration time and quality of results. CHDL and its associated template library build up from a core of gate-level primitives to a register transfer level DSL. Chapter 4 describes the use of RTL CHDL to produce Harmonica, a family of data parallel core designs evaluated in the role of near-memory accelerators. Much of the evaluation of Harmonica designs was performed using CHDL-provided simulation infrastructure, including a power emulation library leveraging CHDL's netlist introspection technique and technology mapping feature. In Chapter 5 and Chapter 6, libraries supporting higher-level abstractions as further layers on top of CHDL are introduced. These demonstrate the practicality of implementing such higher-level paradigms in a CHDL-style constructive hardware description environment.

CHAPTER II

RELATED WORK

The work presented in this dissertation is primarily concerned with broadening the range of hardware description paradigms that can be used, interoperably, from within a single hardware description language environment. It is necessary to place this work within the context of contemporary and historical hardware description language research. A broad range of hardware description languages has been developed over the past half-century, and a steady stream of research has been published and products have been marketed aiming to make the textual description of hardware more expressive, allow more effective design reuse, enable higher-performance simulation of complex hardware for verification, and integrate reconfigurable hardware with general-purpose computing systems. The present work draws from many sources as it presents a method for unification of multiple paradigms, and is not the first work to do so in one capacity or another.

Research specifically focused on using the layering of abstractions to build RTL and higher-level hardware description systems on top of structural generators can be found in the extant literature [24], but the taxonomy in Table 1 illustrates the greater relative extent of the multi-paradigm support implemented in CHDL. By including a high-level pipeline description language among the supported paradigms, this work demonstrates that the full range of abstractions available in hardware description systems can be produced within a generative HDL. In this table, HDL projects discussed in this chapter are taxonomized by the number of hardware description paradigms supported, their extensibility, and the extent of the barriers between these paradigms.

Products are available and research is being performed at all corners of this taxonomy. The number of paradigms can range from one, as found in high-level synthesis tools exposing a fixed execution model to the programmer, to many, as found in mainstream HDLs supporting structural, behavioral, and RTL description. Extensibility is a property of generative HDLs such as Chisel [4] built in general-purpose programming languages, and not of traditional HDLs without support for defining new domain-specific languages.

Table 1: A list of hardware description systems discussed academically, taxonomized by levels of abstraction supported and extensibility.

             System            Language        Structural  RTL  Behavioral  High-Level
Fixed        PMS/ISP [5]       —                   ×        ×
             Sehwa [44]        Lisp                                             ×
             Bambu [46]        C++                                              ×
             Gaut [16]         C++                                              ×
             Trident [56]      C++                                              ×
             LegUp [11]        C++                                              ×
             Handel-C [3]      C                                   ×            ×
             SystemC [42]      C++                 ×        ×      ×            ×
Extensible   PamDC [8]         C++                 ×
             CHDL(2001) [33]   C++                 ×        ×
             Gen. [12]         Java                ×
             JHDL [7]          Java                ×        ×
             Chisel [4]        Scala               ×        ×
Extended     Chisel [24]       Scala               ×        ×      ×
             Cλash [34]        Haskell             ×        ×
             Bluespec [40]     SystemVerilog       ×        ×      ×
             MyHDL [17] [27]   Python              ×        ×      ×
             PyMTL [35]        Python              ×        ×      ×
             CHDL              C++                 ×        ×      ×            ×

2.1 Fixed-Paradigm Hardware Description Systems

A subset of hardware description language work has produced fixed, non-extensible languages, in which the set of abstractions available to designers may not be modified by the inclusion of additional libraries or by the designers themselves. A few of these, the majority of which are high-level synthesis tools intended to translate algorithms expressed in software-oriented programming languages into hardware realizations of the same algorithm, only allow hardware to be expressed within a single paradigm. Still more enable some combination of RTL, behavioral, and generative hardware description in addition to any novel, more-abstract hardware description.

2.1.1 Traditional Hardware Description Languages

The genesis of hardware description languages lies in languages providing multiple fixed paradigms. In 1970, Gordon Bell and Allen Newell published the PMS and related ISP systems of hardware description specifically intended to model computing systems [5]. These were developed for and used in their book, Computer Structures: Readings and Examples [6], an early computer architecture text, as a coherent way to formally describe block diagrams of computer system components. PMS was a structural HDL for computer architectures and its cognate RTL language was ISP. At the time these were developed they were perhaps considered to be extensible systems, in the sense that PMS was entirely built from ISP primitives, allowing blocks of RTL statements to be combined into units which could be connected together at a higher level using a graph notation. By allowing modules of RTL statements to be connected together, higher levels of abstraction could be reached simply by constructing more abstract blocks from lower-level blocks, and in the case of ISP/PMS there was even a way to compose data types from lower-level data types, so all of the examples of, e.g., floating point numbers constructed from two integer types and a bit type could be built in this very early system as well. What separates these systems from what is considered extensible in the present document is the level of abstraction that could be reached in this manner. Raising the level of abstraction of the described system is not the same as raising the level of abstraction provided by the hardware description language itself, and despite the complexity of the systems that could be created from simple primitives through hierarchical construction using a system like ISP/PMS, the HDL paradigm remains structural and certain concepts like complex pipeline controllers remain difficult to express.

The influence of ISP/PMS can be seen in the modern HDLs in most widespread commercial and academic use, VHDL and Verilog. Structured signals and a combination of RTL and structural description underlie both of these language families. Both VHDL and Verilog add to this languages for describing generators, behavioral modules, and a templating or parameter system. The generator system allows procedural code to instantiate submodules and RTL statements, and may use module parameters to modify its behavior. This enables a wide variety of designs to be realized that would require significant repetition otherwise, although criticism of the limitations of the systems for producing generic parameterized modules in both of these languages, when compared to modern templating systems and generative HDLs, is common.

Behavioral descriptions, which were the target of early commercial high-level synthesis tools as described in Section 2.1.3, are now almost entirely relegated to simulation and verification work. These allow the behavior of blocks to be described using a procedural language instead of the register transfer level. The principal distinction between RTL and behavioral descriptions is that assignments within behavioral descriptions are ordered and may be read and modified multiple times by a procedural program using a full set of control flow constructs within a single time step of the simulator. Timing for behavioral descriptions is also arbitrary and need not conform to the edges of an incoming clock signal.

2.1.2 Data Flow Systems

The actor model, as implemented in the CAL Actor Language [20], is a formalization of parallel algorithms as networks of communicating units, called actors. This model may be adopted in the implementation of both hardware and software, and is an eminent dataflow-oriented paradigm. In this model, actors are units possessing state and consuming and producing tokens, which are sent through channels: FIFO links passing tokens which form the edges of the graph of actors. Actors contain actions, which fire if their guard conditions are met, which may be due to the availability of input tokens, the current state of the actor, or both. Actions may both consume inputs and produce outputs, and priorities for actions are explicitly defined. This model, and the CAL Actor Language in particular, has been used to synthesize FPGA configurations, most popularly in the domain of digital signal processing.

A hardware-oriented approach similar to the actor model is the guarded atomic actions (GAA) paradigm as implemented in Bluespec SystemVerilog [40]. In guarded atomic actions, systems are formed as a hierarchy of modules, where each module contains a set of state values, rules describing how that state evolves, and methods, which are externally-invoked rules leading to state change or reading of state. All of these are protected by guard predicates, and rules modifying the same state may not be activated in the same cycle. This is all encapsulated in a superset of SystemVerilog, Bluespec SystemVerilog, enabling backward compatibility with existing Verilog RTL and behavioral code. There is not, however, a way to combine this with further extensions to Verilog, and any other system extending SystemVerilog to add other features would not be aware of Bluespec, limiting the number of paradigms available within the same mutually-compatible system.

AutoPipe and the associated X programming language [21] produce optimized pipelined implementations given a data flow graph. The descriptions in this case are not at the level of hardware, but rather at the level of the flow of data between large units, each implemented either in C++ or in Verilog, with the aim of automatically producing an optimized implementation. The primary disadvantage of this approach when contrasted with the definition of a large design using a system like CHDL is that the interfaces, and the basic operations on the data that appears at the interfaces, have to be defined three times: once for the Verilog representation, once for the C++ representation, and again for the top-level data flow language. The advantage is that the scale of systems that could be described using AutoPipe is potentially much larger than the scale of systems that could be modeled using CHDL, spanning multiple racks of equipment, enabling high-performance simulation of systems that would be impractical to simulate at the gate level.

For the purposes of the taxonomy in Table 1, dataflow paradigms have been lumped in with behavioral models. This is because dataflow paradigms describe a functional behavior in terms of a set of primitives without resorting to register transfer level specifications, and in this sense they are behavioral, but they do not describe an algorithm as a series of steps so they are not precisely a type of high-level synthesis.

2.1.3 High-level Synthesis

There is a large set of contemporary high-level synthesis products, including the GCC-based Bambu [46] and the LLVM/Clang-based Gaut [16], Trident [56], and LegUp [11], as well as a few commercial offerings. This work spans many different problem domains but is all related by the use of a software-oriented compiler as a front-end, producing an intermediate representation that is then converted into a set of hardware primitives, usually represented as some combination of a data path and controller. This approach to design is quite attractive as it allows the automatic generation of hardware architecture to implement a given algorithm within a set of constraints. If the design constraints change during the design process, the high-level synthesis tool can simply be re-run with the new constraints and design work does not have to be replicated.

A compromise approach was taken in Handel-C [3], which provided a hardware-oriented language based on C with explicit concurrency through explicitly parallel loop blocks as well as an inter-block communication interface called the channel. Handel-C expressly forbade writable shared variables between parallel blocks, only allowing inter-block communication through channels providing get and put operators. This categorically prevented the problem of write conflicts by providing semantics to resolve the case in which multiple get or put operations to the same channel occurred during the same cycle. Despite the use of high-level constructions, it is still possible in many Handel-C programs to determine what the state will be during a specific cycle, a difficult prospect in many high-level synthesis environments.

The SystemC hardware description language takes a similar approach to Handel-C by providing a hardware-oriented superset of an existing general-purpose programming language, but unlike Handel-C provides its simulation environment entirely within the context of a C++ library. SystemC provides support for behavioral, structural, RTL, and transaction-level modeling of hardware using a combination of C++ classes and macros. Commercial high-level synthesis tools are capable of translating SystemC input, including modules containing behavioral components, into synthesizable RTL. SystemC modules can be customized using C++ template metaprogramming and generators, allowing such novelties as a CHDL compatibility layer enabling CHDL designs to use the SystemC simulator, but its purpose is to act as a substitute for Verilog or VHDL in workflows incorporating transaction-level modeling.

In cases like video and audio coding where algorithms are quite complex, LegUp provides an automated approach to co-design [11]. This automatically partitions a design between a MIPS core and a set of hardware accelerators at the function granularity. This further automates the process of finding a design to fulfill a set of constraints, as the task of partitioning between hardware and software can itself be automated. The architectures, however, are limited to a very specific format containing a single instruction set processor and a set of accelerators, and in all of these high-level synthesis tools the availability of efficient RTL for particular hardware tasks creates an additional design challenge; there is no automated way to integrate the high-level algorithmic description with existing HDL code. The HLS input is a monolith, only able to communicate with other pieces of the design through narrowly-defined communications semantics.

The FROST framework presented in [18] provides a unified front-end for targeting HLS tools from domain-specific languages. This provides a multi-language interface to FPGA-oriented high-level synthesis tools, but low-level design is precluded by the hardware-obscuring nature of the high-level synthesis layer itself, which is provided by commercial HLS tools supporting the C, C++, and OpenCL programming languages. These tools, while performing the same function as HLS tools targeting silicon, are designed to position FPGAs as potential competitors to GPGPUs and other general-purpose accelerators. By supporting OpenCL and its successor SYCL, they allow the explicit thread-level parallelism expressed in applications for inherently multithreaded architectures to be exploited by the resulting hardware implementations. While OpenCL applications provide explicit operations to copy state between accelerators and main memory, SYCL provides a task graph API, effectively providing a single-language interface to two levels of abstraction, much like the early ISP/PMS system, but with compute kernels and data flow graphs instead of RTL implementations of state machines and graphs defining systems composed of them. Doumoulakis et al., in [19], demonstrated the porting of an OpenCL-based FPGA application to the open source triSYCL implementation of SYCL, preserving much of the performance while automating the task of moving data from host to accelerator.

In any work including a pipeline-oriented hardware description system it would be remiss to fail to mention Sehwa [44], an early high-level synthesis program unique among early HLS work in specifically targeting pipelined designs. From an input data flow graph, Sehwa automatically generates a pipelined implementation meeting a set of specified timing and cost constraints.

The input to Sehwa is a data flow graph that, while acyclic, can contain conditional branches and is wrapped in an implicit outer loop, so it can perform some iterative tasks. The pipeline itself is expected to execute multiple independent tasks in parallel. The input format is simple, but Sehwa is a primitive high-level synthesis tool, in the sense that the input takes the form of a higher-level language and expresses an algorithm rather than a specific architecture, and resource allocation and timing are handled entirely by the synthesis algorithm, which has specific quality-of-results targets to meet. This is quite different from the approach taken in Cheetah, which is also a tool designed to enable the development of pipelined hardware, but which leaves the specific latencies of hardware implementations of combinational logic blocks to downstream synthesis and retiming optimizations, and which makes selection of the pipelined architecture an explicit activity, while allowing and even encouraging looping control flow, even nested loops.

These loops supplant what would be considered, to a tool like Sehwa, pipeline stages with multi-cycle latencies or sub-cycle issue rates. This trades machine-generated schedules and manually designed functional units for machine-optimized functional units and manually designed pipeline schedules created with the help of retiming optimizations. Programs like Sehwa, and more broadly all high-level synthesis tools, can be thought of as a series of transformations on an input netlist, program, or behavioral description. Cheetah input and CHDL GAA input, on the contrary, include invocations and instantiations of CHDL functional units, prompting the generation of in-memory low-level representations, and are therefore compatible with all CHDL hardware blocks and data types. The GAA and Cheetah generate() functions then wire up registers and generate controller logic to implement the appropriate paradigm.

2.2 Extensible Hardware Description Systems

The work discussed so far concerns systems that either use custom languages lacking type systems with operator overloading and other features allowing the creation of domain-specific languages, or are high-level synthesis systems lacking the ability to describe hardware at a lower level. In this section we discuss work concerning systems that use general-purpose languages and allow hardware description in a generative style. These systems are good candidates for extensibility in the same sense as the work presented in this dissertation, although this has not been explored in these HDLs to the extent that it has been explored in CHDL.

2.2.1 Simulation and Verification Oriented Systems

PyMTL [35] is a modeling environment that integrates three levels of architecture modeling, dubbed the functional level, cycle level, and register transfer level. This may all be done from within the same environment, and with the support of a high-performance JIT that accelerates the higher-level models. This enables trade-offs between performance and fidelity to be made; an RTL accelerator model could be validated with behaviorally-modeled memory system components, all within the same system.

This can be contrasted with CHDL by the fact that only the register transfer level in PyMTL is synthesizable. The levels of modeling in PyMTL are not levels of abstraction for describing hardware, although high-level synthesis could presumably be grafted in to provide this support; they are, rather, levels of abstraction for modeling hardware. The focus in the PyMTL system is on developing systems for modeling hardware at multiple levels of fidelity and performance simultaneously. The primary focus of the work presented in this dissertation, however, is the use of a unified framework for describing hardware, in which all of the described hardware is synthesized to a gate-level netlist that can be mapped to hardware.

2.2.2 Generative Hardware Description Languages

As part of the PAM project, an early experimental reconfigurable computing platform, the PamDC hardware description language was developed [8]. This was a C++-based system that used operator overloading and template metaprogramming to enable the structural description of hardware using C++. The level of description allowed was entirely structural, providing a register initialized to zero as a primitive and allowing more complex hardware to be constructed from this and combinational gates. A comparable system, CHDL, for C++-based Hardware Description Language, which bears a similar name to the CHDL Hardware Design Library discussed in this dissertation, is described in a brief two-page 2001 paper [33]. This system allowed for both structural and state-based, presumably RTL-like, description in C++. While it does not offer higher levels of description than state-based, and further extension is not mentioned in the paper, the fact that it implements a state-based language on top of a structural description points toward the kinds of extensions implemented in the present work.

In 1998, two systems for writing generators using the Java programming language were reported in the literature. Chu et al. described a simple system mapping Java objects to hardware modules and running generators in constructors [12]. This work relies on Java's metadata interface to allow discovery of the design hierarchy and signals, not requiring an equivalent of CHDL's TAP() or HIERARCHY_ENTER()/HIERARCHY_EXIT(), described in Section 3.4.5.

The JHDL [7] system was also described at this time, but took a very different approach. Instead of building up designs from a low level, combinational or synchronous circuits were described at the register transfer level, as executable Java code, within specially designated Java functions. Since these functions are programmer-visible as Java bytecode, they could be directly netlisted from the Java program itself. This allowed very tight integration with partially-reconfigurable hardware and allowed JHDL to be used as a platform for reconfigurable computing. In addition to RTL, classes could also simply be structurally defined as combinations of other classes. Since this could conceivably include arrays of other classes strung together by elaborate constructors, JHDL structural classes had the capacity to be quite complex generators, although this was never used to extend JHDL beyond RTL.

2.3 Extensible Generative Systems

The MyHDL system is a system for performing RTL design and building behavioral simulation models using the general-purpose Python programming language [17]. The original version of MyHDL is built around Python's support for continuations using the yield keyword. Each of these functions yields a sensitivity list, and its execution is allowed to continue by the simulator when one of the conditions in this sensitivity list is met. This allows for the simulation of rather complex behavioral models, as very complex procedural programs can execute cycle-to-cycle, one yield to the next. These kinds of behavioral models are not synthesizable, although a synthesizable register transfer level and structural subset is provided.

The feature that makes the MyHDL system extensible is the fact that the simulated object hierarchy is constructed entirely at run time, and this can be performed entirely under the direction of a Python program, allowing the construction of domain specific languages producing synthesizable hardware as output. It has primarily been discussed in hobbyist circles, although MyHDL has received some academic attention as of 2015 [27]. This work described the addition of essential functionality to MyHDL, allowing structured data types with signals, including other structured data types, as member variables to be used in the generation of MyHDL logic.

The Cλash HDL exploits the concepts of functional programming to enable the elegant and compact description of synchronous hardware [34]. The Cλash system is specifically designed for enabling the description of Mealy machines with single clocks, performing simulation by repeatedly executing a transformation function on a series of inputs and a machine state, producing a final machine state and a series of outputs. Also included is a compiler that processes the Core intermediate format provided by the Glasgow Haskell Compiler to transform these functions to synthesizable VHDL. This operation could be classified as high-level synthesis in a very broad sense, but since the transformation is performed entirely on combinational logic and does not enable procedural execution or, equivalently, arbitrary recursion depth, it is better thought of as a system enabling generators, as arbitrary recursion is definitely supported by any higher-order functions involved in the production of this state transition function.
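
The simulation scheme described above is compact enough to sketch directly; the following is a transliteration of the concept into C++ for consistency with the rest of this document, not Cλash's implementation. A step function maps a state and an input to a successor state and an output, and simulation folds this function over an input sequence.

    #include <utility>
    #include <vector>

    // Fold a Mealy-machine step function over an input stream, threading the
    // machine state from cycle to cycle and collecting the outputs.
    template <typename S, typename I, typename O, typename F>
    std::vector<O> simulate(F step, S state, const std::vector<I> &inputs) {
      std::vector<O> outputs;
      for (const I &in : inputs) {
        std::pair<S, O> r = step(state, in);   // (next state, output)
        state = r.first;
        outputs.push_back(r.second);
      }
      return outputs;
    }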

The examples given for the Cλash language are quite simple, as is the logic generated, but this is not a fundamental limitation of the system. Because higher-order functions, essentially generators, are supported, a wide variety of paradigms could be implemented.

In 2012, the first description of the Chisel system was published [4]. This work emphasized the fact that Chisel is a system for describing generators, using features of the Scala language such as operator overloading to provide a natural interface for describing hardware using generators. In this work, some possibility of extensibility beyond generators was suggested. Chisel introduces a keyword, when, that enables the construction of RTL-like statements using Scala's support for passing blocks of code as function arguments. This has not been extended to the possibility of nested when statements or to higher levels of abstraction, but there is no inherent limitation in Chisel itself that prevents the kinds of constructs that have been implemented as part of CHDL from being ported, and some work to this effect has been done, including an implementation of Bluespec-like guarded atomic actions.

20 2.3.1 Extending Generative HDLs

Generative HDLs are the best candidates for receiving ports of CHDL’s multi-paradigm features. A generative HDL could support description at multiple levels of abstraction in the same way as CHDL as long as it was implemented in a general-purpose programming language with good domain-specific language support, generation was performed by execu- tion of this code, and references to signals could be stored and assigned later. These three traits enable the creation of generators that implement register-transfer-level and high-level paradigms on top of generative HDLs and, while specific and excluding several existing tools, are not unique to CHDL.

Work by David Greaves at the University of Cambridge has demonstrated a variety of higher-level paradigms being implemented on top of the Chisel system [24]. This includes an implementation of register-transfer-level semantics including nested if statements, the concurrency model from Handel-C, an implementation of Bluespec-like Guarded Atomic Actions, and an implementation of transaction-level modeling, the high-level inter-module communication model supported by SystemC. This work introduced the possibility of a rich set of hardware description paradigms implemented as domain-specific languages on top of a generative HDL. Among its contributions is a system for interoperability between the method interfaces of Bluespec and the transactional interface of transaction-level modeling, pairing the inputs and outputs of method invocations into single transactions to enable full interoperability among the paradigms presented.

The present work provides a similar RTL and GAA layer on top of a generative hardware description language, but also extends this to include support for Cheetah, a pipeline-oriented HDL providing automatic insertion of pipeline registers. This effectively raises the level of abstraction to that of high-level synthesis, cementing the assertion that any hardware description paradigm may be implemented as a library or DSL on top of a generative HDL.

2.4 Motivation for Present Work

Literature concerning hardware description languages has principally focused on designer productivity, quality of results, and simulation performance in one sense or another. All of the systems described here had some type of productivity-related goal in mind, whether it was simply enabling direct comparison of computer architectures like PMS/ISP, enabling design reuse like all of the generative languages, or enabling the development of hardware accelerators by programmers instead of hardware designers like much of the high-level synthesis work. The present work is a continuation of this thread that focuses on improving designer productivity in two ways: first, by raising the level of abstraction through a collection of DSLs suited to specific design tasks, and second, by enabling design reuse through lowering the barrier to integrating designs created using different paradigms into a single design.

The decision to implement CHDL using C++ instead of Python, Scala, Haskell, Scheme, or any of the other appropriate languages comes down to practical considerations about its ubiquity and the performance requirements of software in the hardware design toolflow. The performance of design elaboration within hardware description languages cannot be entirely overlooked, but it is primarily simulation and synthesis performance that define the performance of an HDL-based design flow. The decision to use C++ for HDL work enables a single language to be used for both hardware description itself, where performance is not as critical, and related CAD algorithms, which are highly performance-sensitive, without the duplication of effort required when crossing language boundaries.

The principal way C++ enables the development of performant software is through its templating system, allowing reusable structures to be described that compile to high-performance, specialized code, which is an important enabler for CAD algorithms. The use of template metaprogramming also enables type safety by allowing algorithmic type checking and expansion, a feature useful for hardware description.

The wide installed base of C++ is also a relevant consideration. All widespread desktop, laptop, single-board computer, and microcontroller platforms support development using the C++ programming language, and every major platform used for development can run a C++ compiler, typically available as a free download. This has led to a large number of developers familiar with C++ and widespread knowledge of the language, making C++ arguably the most widely known language with the features required for implementing an extensible HDL.

CHAPTER III

DESIGN OF CHDL

3.1 Introduction

CHDL is a C++ library for hardware description. CHDL programs are first run through a C++ compiler, producing executable generators that in turn produce an in-memory netlist describing hardware blocks using CHDL API calls. Because of the expressiveness of C++, the code describing these generators may take the form of domain-specific languages implementing higher-level paradigms, but ultimately the modeled hardware is converted into an in-memory gate-level netlist that may be written out as synthesizable Verilog or simulated.

The key contribution of CHDL is that a diverse collection of hardware description paradigms may be employed within a single framework with a small set of core primitives. Designs implemented in CHDL may range from the gate level, to the register transfer level, to designs using the CHDL implementation of guarded atomic actions, to designs using Cheetah, a pipeline-based HDL also implemented as a library on top of CHDL. Each of these pieces is designed in such a way that it does not hide the layer of abstraction beneath it, preserving interoperability between pieces designed using different paradigms.

To use a concrete example, say we create a templated N-bit saturating integer data type called satint using CHDL, along with a set of operator overloads to support basic arithmetic using this saturating integer data type. Operations on satint types all generate combinational logic, so they are just as easily expressed in CHDL as they are in synthesizable Verilog, VHDL, or another RTL-oriented HDL. Now say we have a pipelined accelerator core that is implemented using Cheetah. Because Cheetah extends the ordinary CHDL types to represent signals, we can still use satint without any modification. This is reasonable behavior; combinational logic in an automatically-generated pipelined datapath still performs the same function as the same logic in a datapath expressed as RTL.
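As a concrete sketch of what such a type might look like, the following is a minimal, hypothetical satint built only from primitives documented later in this chapter (bvec, Zext, Mux, range, and an assumed Lit<N>() vector-literal form); it is not part of the CHDL distribution, and the Mux select polarity shown is an assumption:

    // Assumes the CHDL headers are included and the chdl namespace is visible.
    template <unsigned N> struct satint {
      chdl::bvec<N> v;
      satint() {}
      satint(const chdl::bvec<N> &x): v(x) {}
    };

    // Saturating add: compute an (N+1)-bit sum, clamp to all-ones on carry-out.
    template <unsigned N>
      satint<N> operator+(const satint<N> &a, const satint<N> &b)
    {
      using namespace chdl;
      bvec<N+1> sum = Zext<N+1>(a.v) + Zext<N+1>(b.v);
      return satint<N>(Mux(sum[N], sum[range<0, N-1>()], ~Lit<N>(0)));
    }

Because the overloads produce ordinary combinational logic on bvec signals, the same type works unmodified whether the surrounding datapath is hand-written RTL or generated by Cheetah.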

In this chapter, the implementation of CHDL and its included template library is described, from the basic primitives allowing the description and simulation of logic to the first level of abstraction beyond that, which enables register-transfer-level description of hardware designs. A technique enabled by CHDL called netlist introspection, in which programs generating hardware use the APIs exposed by CHDL to extend CHDL itself, is also highlighted.

3.2 Synchronous Design

CHDL intrinsically embraces the concept of synchronous design; a CHDL design is restricted to only D flip-flops and combinational logic. Stated equivalently, CHDL designs are directed graphs of gates in which all cycles necessarily pass through at least one D flip-flop. SRAM, both synchronous and asynchronous, is available as well, but the path through a read port belonging to an asynchronous SRAM is considered combinational logic.

The behavior of reading and writing the same memory location simultaneously, within the same clock cycle, is undefined. Bypass paths for asynchronous SRAM must be explicit, and these count toward the no-cycle rule. As a result, many common structures such as D latches, SR latches, and dynamic logic cannot be explicitly modeled. For the kinds of computation-oriented logic CHDL is designed for, these structures are considered superfluous. This refusal to allow free-form asynchronous sequential logic design shapes the focus of CHDL away from such designs, despite CHDL's stated goal of universality; adoption of a synchronous-only approach is the equivalent of rejecting asynchronously clocked designs wholesale.

In the context of computation, synchronous designs are more testable, more straightforward to formally verify and synthesize, and more immune to subtle design errors. It is true that latches require fewer transistors per bit than flip-flops, especially scan flip-flops, and that this may lead to area or power reductions; but in storage and register file cases, SRAM saves even more area, and in general pipelined computation cases, cycle-stealing latch-based designs may be converted to D flip-flop designs with no loss of design content and little increase in area.

Perhaps the most valuable feature enabled by synchronous logic is scan-based test, including built-in self-test, in which all bits of state in a design can be arranged into a set of shift registers, allowing the full state to be clocked in or out. One of the reasons for embracing scan-based testing is the tractability of automatically generating test patterns.

ATPG-produced, externally-applied test patterns and BIST strategies, perhaps including test point insertion, are essential for reliability in deep submicron technology nodes.

Scan chains have further use during debug, characterization, and failure analysis, providing access to and allowing modification of a design's state. Much like in-circuit debugging of microcontrollers, this enables a wide range of techniques that would not otherwise be practical. This can be illustrated with a simple example: a BCD counter with a divide-by-10 input is damaged, and this has led to a field return. You wish to validate some downstream circuitry that is only actuated when the counter reaches a specific value. Due to the failure, this value is never reached. By loading the value into the scan chain, the downstream or external feature may be examined despite the fact that the requisite conditions are otherwise unreachable.

Beyond such rare and special circumstances, the scan chain enables the use of silicon as an extraordinarily fast simulator and FPGAs as very fast simulators. Through the scan chain, states simulating a wide range of conditions may be reached and a logic design thoroughly characterized in a way that may not be practical even with the fastest simulator. Answering questions about what happens over trillions of cycles in designs containing megabits of state becomes tractable.

Given the complexity of digital designs, the benefits of the scan chain alone are enough to recommend synchronous design. There is also, however, the matter of correctness. It is quite straightforward to formally express and evaluate synchronous designs as Mealy state machines, and far easier to ensure that they retain their functional correctness through a wide range of circuit transformations that may, e.g., impact timing. The same cannot be said of asynchronous designs containing a large number of derived clocks, a problem that has led to the inability to simulate, on FPGAs, gate-level asynchronous designs described in Verilog.

Much of asynchronous design is already difficult to express in a language like Verilog, with the exception of simple latch structures. CHDL simply takes this to its logical conclusion by providing no provision for any novel latching structures to be described at the gate level, preventing timing errors by forbidding derived clocks entirely. This dependence on synchronous design is a fundamental characteristic of CHDL.

Figure 3: Overview of dataflow in CHDL. (Design generation: gates, modules, and designs produce an in-memory netlist; design processing: the netlist feeds the optimizer, simulator, and technology mapper.)

This is not to say that it is impossible to model asynchronous designs at all within CHDL. While the intent of CHDL is that it be used to model processor designs and other complex synchronous logic, there are other types of circuits that may benefit from description and simulation within a C++ environment. Gate-level models have been constructed using CHDL, quantizing propagation delay using clock cycles to represent fixed time increments. Further possibilities for a future approach incorporating hooks for an entire in-memory suite of CAD tools, including timing-accurate gate-level modeling, are discussed in Section 7.2.

3.3 Structure of CHDL

Programs written using CHDL express generators for hardware, which construct an in-memory netlist. This may then be optimized, simulated, or technology mapped to standard cell libraries using function calls available in CHDL. Figure 3 illustrates the flow of data in CHDL. All of the “design processing” pieces, as well as all of the functions used to manipulate the in-memory netlist, are provided by CHDL or its template library, CHDL-STL, though APIs are provided to allow external code to add functionality as well.

3.3.1 Template Metaprogramming

The design of CHDL takes advantage of C++ template metaprogramming to enable compile-time signal type checking. It would be feasible to perform run-time checking of signal types and to create dynamically-allocated C++ objects representing signals, but the use of template metaprogramming to construct operations in CHDL enables the C++ type system to ensure that operations are semantically valid at compile time. A simple example illustrating the use of template metaprogramming in CHDL is a library function for performing a population count. This makes heavy use of functions provided in the CHDL core library, described in detail in the next section.

If we were to use dynamically-allocated C++ STL types to represent our vectors of bits, we might create a single recursive population count function that adds the population counts of two half-vectors, with special base cases for vectors of length zero and one. The length of the result of this operation could not be known at compile time. Using template metaprogramming, the population count operation may be implemented as:

    bvec<0> PopCount(bvec<0> x) { return x; }
    bvec<1> PopCount(bvec<1> x) { return x; }

    template <unsigned N> bvec<CLOG2(N+1)> PopCount(bvec<N> x) {
      return Zext<CLOG2(N+1)>(PopCount(x[range<0, N/2-1>()])) +
             Zext<CLOG2(N+1)>(PopCount(x[range<N/2, N-1>()]));
    }

Using the template metaprogramming approach, all of the quantities representing dimensions are compile-time constants.

3.4 CHDL Core Library

3.4.1 Basic Hardware Manipulation

• CLOG2()

The operation ⌈log2 N⌉ is so ubiquitous throughout hardware design that, in CHDL, it is given its own function, one of very few non-hardware-generating utility functions in CHDL. Using the C++ constexpr keyword, this function is evaluated at compile time, meaning that it may be used within the declarations of constants and template arguments. This is, in fact, its primary use, as the general operator for the number of bits needed to select one of N possibilities. It shows up in the declarations of many functions throughout the CHDL core library, template library, and hardware designs using CHDL.
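For illustration, a plausible C++11-compatible constexpr definition follows; this is an assumption about the implementation, not the actual CHDL source:

    // Smallest r such that 2^r >= n, i.e. the ceiling of log2(n).
    constexpr unsigned CLOG2(unsigned long n, unsigned long p = 1, unsigned r = 0) {
      return p >= n ? r : CLOG2(n, p * 2, r + 1);
    }

Because the recursion happens entirely at compile time, CLOG2(N) may appear in template arguments such as bvec<CLOG2(N)>, as in the PopCount example above.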

• reset();

A core decision around which many of the design choices in CHDL were made was the choice to use a stateful API. By having, at any time during program execution, an implicitly addressed global state, variable declarations and function invocations are shrunk in size, though the number of situations in which CHDL designs may be elaborated is also diminished. The choice to use global variables hidden from the user to contain this state specifically sacrifices flexibility in the elaboration stage in order to make variable declarations cleaner, since they do not have to be pointed at a CHDL context.

The CHDL design accumulates in memory as CHDL calls are made, and simulation and optimization are performed on the in-memory netlist. In many cases, a single program may wish to generate more than one CHDL hardware design. In these cases, the reset() function must be invoked. The reset() function is connected to each piece of CHDL that stores state. Each component uses the REGISTER_RESET() macro to instantiate the reset_func template class, whose constructor registers the call in a global list. Many of these pieces of global state are pointers initialized to NULL, which can be reset by deleting the heap-allocated object they point to and re-initializing them to NULL. A convenience function, delete_on_reset(), is provided for configuring this common case.

This may seem like internal CHDL plumbing, but in addition to the reset() function itself, all of these hooks are provided as part of the user-facing CHDL API in case there are also user-allocated objects, e.g. for high-level simulation, that must be cleared along with the generated design.
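As a hedged illustration, a program generating two designs in sequence might look like the following; build_design_a() and build_design_b() are hypothetical generator functions, and the print_verilog(std::ostream &) signature is an assumption based on its use later in this chapter:

    #include <fstream>

    int main() {
      build_design_a();                              // hypothetical generator
      optimize();
      { std::ofstream f("a.v"); print_verilog(f); }  // write the first design

      reset();                                       // clear all global CHDL state

      build_design_b();                              // second, unrelated design
      optimize();
      { std::ofstream f("b.v"); print_verilog(f); }  // write the second design
    }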

• node and nodeimpl

Another fundamental design decision affecting CHDL is the decision to split the fundamental object representing a net or logic signal in a circuit into two pieces: node, the designer-facing handle representing but not actually containing the circuit node, and nodeimpl, the behind-the-scenes structure that stores the logical netlist, including relationships between nodes. This distinction allows a single node type to represent all of the available logic gates in CHDL without further extension and enables assignment operators to behave reasonably, even in cases where multiple node objects ultimately wind up representing the same circuit node and in cases where the consumer of a value occurs before its producer. Consider the following simple example:

    node x;
    x = Reg(Inv(x));

In this case, and many others like it, a node must be evaluated before its value has been assigned. The use of a nodeimpl to represent the node allows us to keep a list of all node objects pointing to the same nodeimpl and, when the assignment operator is called on any node object pointing to a given nodeimpl, to update all of the nodes pointing to that nodeimpl to point to the new nodeimpl. In the following example:

node a, b;

a = b;

a and b initially refer to two different entities, but at the end they both point to the entity originally referred to by b. The nodeimpl pointed to by a in the beginning becomes inaccessible, to be reclaimed by the dead node elimination optimization when the time comes.

A subtle consequence of the behavior of the node object is the fact that node's assignment operator is a const member function of node. It does not, after all, modify the node object itself through the this pointer; it modifies all of the node objects pointing to a given nodeimpl through the nodeimpl's array of non-const pointers to node objects. This is similar to the counter-intuitive fact that delete may be called on a const pointer in C++, because the memory manager is not invoking the destructor through the this pointer but through its own internal data structures. Assignment in CHDL is transitive and does not depend on program order, thereby allowing information to flow up the page as in the following example:

node a, b, c;

a = b;

b = c;

c = Lit(1);

which causes all three node objects to point to the same literal logic 1 value.

In addition to the C++ classes node and nodeimpl, a circuit node in CHDL may be referred to by an integer value, using the type nodeid_t. A node's ID is simply the index into the global vector of nodeimpl pointers, nodes, at which the node's implementation is stored. This type is used when the assignment behavior of node is not desired, but it is invalidated by optimizations.

A member function of node, check(), is provided to ensure that a given node points to a valid node ID. Failure of this check results in a controlled program crash instead of an access to invalid memory. A major hurdle to the use of CHDL, mitigated in many ways such as this, is the basic problem of designing hardware in a type-unsafe systems programming language like C++. Unless care is taken, through the insertion of checks and features that make it difficult to express invalid designs in syntactically valid ways, the program may simply violate memory access protections and crash, or leak memory, or suffer from any of a number of other software errors before we have even begun to debug the hardware itself.

A special template class that is not meant to be instantiated, sz<T>, is specialized for every CHDL signal type. It has a single constant integer member, value. The constant sz<node>::value is one. Other types in CHDL that aggregate multiple types together, such as vec and ag, set the value of sz<T>::value to the total size of all of their constituents, so that it can always be used to obtain the number of physical wires needed to carry a signal or the bits of storage needed to store it.
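A minimal sketch of how such a size metafunction can be structured follows; the exact CHDL declarations may differ:

    template <typename T> struct sz;     // left undefined for non-signal types

    template <> struct sz<node> {
      static const unsigned value = 1;   // a lone node occupies one wire
    };

    template <unsigned N, typename T> struct sz< vec<N, T> > {
      static const unsigned value = N * sz<T>::value;  // recursive total
    };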

• printable

One of the stated goals of CHDL is the ability to emit synthesizable netlists and act as a front-end to commercial or vendor-specific tools. The ability to dump the in-memory circuit representation to a variety of formats is therefore important. Previously, there was an ad-hoc mechanism in place for doing this, with functions called print() and print_vl() provided by all nodeimpl sub-types to emit native netlists and synthesizable Verilog respectively. This has been superseded by a system in which each object that may be dumped in any format inherits from a class called printable and overloads a set of functions:

• register_print_phase(language, phase)

• is_initial(language, phase)

• predecessors(language, phase, set &)

• print(ostream &, language, phase)

• print_design(ostream &, lang)

This leads to the presence of another global list, the list of printable objects. These printable objects are ordered by the is_initial() and predecessors() functions into a directed acyclic graph. This graph may be different for each language supported and for each phase of printing. At the moment, the supported languages are Verilog and CHDL's own netlist format, but more are expected in the future, including some abstract formats such as Graphviz .dot, currently supported through the CHDL visualization API call print_dot().

• cdomain, tickable

Simulation of synchronous designs requires both a way to evaluate combinational logic and the concept of a clock edge, the temporal basis for synchronous logic. In CHDL, “derived” clocks are not considered for simulation; only clock domains constructed around periodic clocks are allowed. When netlists are exported for further processing, each clock domain becomes a separate clock input. During generation of designs, a “current” clock domain is maintained along with a stack of previous clock domains, allowing individual design modules to use their own clocks. The functions push_clock_domain() and pop_clock_domain() are used on entry and exit of portions of the design using clock domains other than the default. This has enabled the description, e.g., of asynchronous FIFOs surrounding independently-clocked sub-blocks. Each clock domain contains a subset of the clocked objects, tickables. Within CHDL, each tickable has four separate, independent function calls during each step of the simulation clock: pre_tick(), tick(), tock(), and post_tock().

The basic clockable objects in CHDL, D flip-flops, read their inputs during the “tick” phase and update their outputs in the “tock” phase. This subdivision of the clock enables zero-propagation-delay ideal flip-flops without any need for evaluation-order considerations to prevent the zero-propagation-delay outputs from violating the zero hold time of downstream flip-flops, e.g. in shift registers. These subdivisions provide a virtual set-up and hold time that can be considered vanishingly small but large enough to ensure correct synchronous behavior.
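The four phases correspond to an interface along the following lines; the function names come from the text above, while the exact signatures (e.g. whether a clock domain argument is passed) are assumptions:

    struct tickable {
      virtual void pre_tick()  {}
      virtual void tick()      {}   // sequential elements sample their inputs
      virtual void tock()      {}   // sequential elements drive new outputs
      virtual void post_tock() {}
      virtual ~tickable() {}
    };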

• cdomain_handle_t

During simulation, tickables do not have to share a single clock, and when written to Verilog there may be multiple clock signals. CHDL supports multiple clock domains and has been used to implement structures such as asynchronous FIFOs used to bridge clock domain boundaries.

Since the CHDL API is stateful, it can manage clock domains, and it does so using internal tables. The index into these tables used to specify a clock domain is cdomain_handle_t, an integer type associated with every tickable. As the simulator progresses, clock edges are generated in clock domains, updating every tickable in these domains by calling their pre_tick(), tick(), tock(), and post_tock() functions.

• sim_time(id);

A user-visible function, sim_time(), provides an interface producing a count of cycles simulated in a given clock domain. Since CHDL has no concept of the passage of time beyond the count of clock edges, this is a reasonable way to manage time in simulation.

• print_time();

The primary output of the CHDL simulation engine is a VCD waveform file showing the evolution of selected node values over time. VCD, or “value change dump,” is a file format developed for storing waveform data produced by Verilog simulations. Higher-level interfaces have been built using, e.g., the CHDL console and the Egress() function, but the raw debugging value of waveform files is difficult to discount.

The print_time() function prints a VCD-compatible time-step and could be used as part of a customized waveform writer; it is also used by CHDL's internal VCD waveform writer.

• advance();

Advancing the clock one cycle for a given clock domain is the basic operation performed in simulation of a CHDL design, and the central function of the simulation portion of the CHDL library. It is possible to use callbacks through the Egress() interface and other tricks, but the usual main loop of a complex CHDL simulation providing interactions between procedural C++ code and CHDL designs repeatedly calls advance(), instead of using the otherwise more-common run() function.

The advance() function simply calls the four functions pre_tick(), tick(), tock(), and post_tock() of each tickable object in the given clock domain, or, if called with no clock domain, it advances the global tick count and advances any clock domain due for a tick.
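A minimal sketch of such a main loop follows, assuming a hypothetical done_node asserted by the design and the lambda-based Egress() usage described in Section 3.4.5 (the argument order is an assumption):

    bool done = false;
    Egress([&done](bool v){ done = v; }, done_node); // mirror the node into a C++ bool
    optimize();
    while (!done) {
      advance();  // one clock edge in the default clock domain
      // ...procedural C++ interacting with the simulated design goes here...
    }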

• run();

There are two versions of the run() function: one takes a cycle count as an argument, and a more general version takes a function returning a C++ bool as an argument. This allows, e.g., CHDL nodes to be used to trigger the end of the simulation. One of the arguments to run() is a C++ output stream, providing a place to write a VCD waveform. This may be opened using a waveform viewer such as GTKWave [10] and provides a highly valuable source of debugging information for CHDL designs.

Internally, run() simply calls print_vcd_header(), followed by advance(), followed by print_time() and print_taps() to print the relevant node values. Nodes are added to the list of taps by the tap() function, discussed in detail in Section 3.4.5.

• finally(), call_final_funcs()

Frequently, it is desirable to perform some action, such as printing statistics or writing a report file, at the end of a simulation. The finally() function enables this; functions registered using finally() will be called when call_final_funcs() is invoked, either at the end of a simulation managed by run() or manually at the end of a user-defined simulation loop.

• TruthTable, LLRom

ROM lookup tables are efficiently substituted with combinational logic in many cases, yielding a synthesizable standard cell design that can be represented as a collection of logic gates. This is the purpose of LLRom(), a “low-level,” i.e. logic gate, model of ROM. Originally this was implemented as a set of Lit values used as inputs to a multiplexer controlled by an address. The optimizations in synthesis tools handled these just as well as any other representation of the same function, and for repetitive content, with the stride of the repetition a small multiple of a power of two, the CHDL built-in optimizations are quite effective. For a bit of additional compute time, however, it is possible to represent such look-up tables with fewer nodes, leading to smaller representations and less time expended on later optimizations. TruthTable() uses the well-known method of prime implicants to represent lookup table contents, also taking commonalities between output bits into account. It uses an algorithm designed for performance instead of optimal output, simply combining adjacent prime implicants that differ by the value of a single bit. Selecting the prime implicants to reduce in random order provides a way to quickly produce a result that, while not optimal, is potentially more compact than attempting to reduce prime implicants in bit order.

Table 2: Utility functions for instantiating memory.

Function    Rd. Ports   Wr. Ports   Synchronous
Memory()    N           M           Either
Syncmem()   N           1           Yes
Syncmem()   1           1           Yes
Syncrom()   1           0           Yes
Rom()       1           0           No

• Ram(), Rom(), Syncmem()

While it is technically possible to implement array structures in CHDL using combinational logic and D flip-flops, it is difficult to identify arrangements of these elements and convert them into structures in the synthesizable Verilog output that can be inferred as RAM structures by FPGA synthesis tools. If, instead, the FPGA toolchains use D flip-flops and combinational logic for these arrays, performance and area suffer greatly. If the target is silicon, popular tools do not automatically convert arrays into instances of memory macro cells, and instead this must be handled using Verilog sub-modules. Furthermore, simulation performance of CHDL memory is much higher than simulation performance of the equivalent set of logic gates.

Memory in CHDL is implemented with a nodeimpl subclass called qnodeimpl for all of the data outputs. For asynchronous memories, each of these outputs depends on all of the address bits, data input bits, and write signals, to prevent these signals from being optimized away. In addition to the qnodeimpl class, there is a memory class that is a subclass of tickable. This contains arrays listing all of its address bits, data bits, and write signals for each read or write port. For synchronous ports, inputs are read during the tick() phase of simulation and outputs are written during the tock() phase, just like D flip-flops.

The memory object is relatively hidden from CHDL users and is instantiated through a set of simplified template functions instantiating synchronous and asynchronous ROMs and RAMs. These functions are listed in Table 2. The most general Memory() function can instantiate any type of memory but requires more arguments than the more specific Syncmem() functions or the specialized overloads of the Memory() function. The general Memory() function also requires that the read ports be provided as arrays of addresses and data outputs. The single-ported Memory() function simply provides the read port data as a return value and takes an address as a bvec.
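As a rough illustration only — the parameter order here is an assumption, not the documented CHDL signature — a single-ported synchronous RAM might be instantiated along these lines:

    bvec<10> addr;                     // 1024 entries
    bvec<32> d;                        // write data
    node w;                            // write enable
    bvec<32> q = Syncmem(addr, d, w);  // read data provided as the return value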

3.4.2 Optimization

The low-engineering-effort, fast-executing algorithm chosen for TruthTable() is an example of the approach to optimization problems taken everywhere optimization is required in CHDL. It is not a design goal of CHDL to produce optimal gate-level implementations of its input, but given the intended use of functions and operators producing combinational logic in CHDL, e.g. the use of multi-bit carry lookahead adders to represent circuits for incrementing inputs by fixed values, it is necessary to perform some optimization to prevent the netlists output by CHDL from being very large and simulation times from being unbearably long. The intent is not to provide world-class synthesis algorithms competitive with commercial synthesis tools. The optimizations of CHDL, modestly, aim only to enable this “lazy” and intuitive approach to hardware design, with the knowledge that this kind of design shorthand will not impact simulation time, Verilog file size, or intermediate representation content.

As an example, suppose we wish to invert every second bit of an input value. To make our intent clear, we perform an exclusive or with a constant instead of writing a loop that alternately instantiates inverters. We are confident that our synthesis tools downstream would catch this, but (1) we are concerned about simulation time, (2) we are concerned with the size of our intermediate Verilog file, and (3) we may want to perform some area estimation by technology mapping our in-memory netlist and do not wish to have its accuracy unnecessarily burdened by space-consuming exclusive-or gates.
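Concretely, the shorthand version might be written as follows (the Lit<N>() vector-literal form is assumed):

    bvec<8> in, out;
    out = in ^ Lit<8>(0x55);  // invert bits 0, 2, 4, and 6
    optimize();               // reduces the constant XOR to inverters and wires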

The built-in optimizations in CHDL are suitable for these kinds of light-duty goals: folding in constants, eliminating redundant expressions, and removing hardware with no path to an output or observable internal variable.

• node_sweep(), opt_dead_node_elimination()

One exception to the assertion that the optimizations in opt.h are meant to be run once the entire design has been elaborated is node_sweep(). While the design is being elaborated, a large number of residual nodes with no successors may build up. This is not usually a problem, and repeated runs of opt_dead_node_elimination() by optimize(), called after the design is fully elaborated, will take care of dead nodes. However, in designs that, e.g., frequently algorithmically re-assign nodes, dead nodes may build up. The node_sweep() function simply removes all nodeimpl objects not referenced by a node object, repeatedly, until only currently-valid nodes and their predecessors remain. The final opt_dead_node_elimination() optimization called by optimize() simply re-runs this operation.

• opt_contract()

“Contraction” in CHDL parlance is what peephole optimizations are to compiler writers. These simple rule-based optimizations replace networks of logic gates with equivalent, smaller implementations of the same functions. The search for reducible patterns is performed repeatedly until none are left. The contraction rules are depicted in Figure 4.

Most of these operations involve literals, so it would also be fair to say that CHDL contraction is a constant folding operation; indeed, this is a core purpose, e.g. when performing a multiply by a constant instead of a combination of shifts and additions.

• opt_combine_literals()

Each literal in CHDL is unique. This is vital to its function: if assigning a different value to a literal zero assigned that same value to every literal zero, CHDL elaboration/generation would not be intuitive. The opt_combine_literals() optimization provides a way to dramatically reduce the space taken up by these literals in the nodeimpl vector, and therefore in the synthesizable Verilog output, at the cost of making further assignments to nodes in the design unpredictable. This is why it is typically performed after the design is elaborated, reducing the number of litimpls in the design to at most 2.

Figure 4: Contraction rules implemented by CHDL, most of which serve to implement constant folding and strength reduction type optimizations.

• opt_tristate_merge()

In some cases, though not universally, it is desirable to replace bus accesses with a single tri-state driver and multiplexer. This is what opt_tristate_merge() does to every tristate node. Since the desirability of this transformation is not universal, it is not run as part of the optimize() function.

• opt_assoc_balance()

The fundamental gates of CHDL can be thought of as 1 and 2-input logical nand operations. Constructing logical functions having more than 2 inputs relies on structures built from these primitives. Logical operations having the associative property, such as logical and, or, and xor, may be assembled into combinations with an arbitrary number of inputs simply by connecting sub-blocks implementing the 2-input version of the function output-to-input in any arrangement. Tree-like structures have less propagation delay but are less efficient with respect to routing resources. Linear structures are quite easy to route but have the longest possible propagation delay. The number of 2-input units required is the same no matter what arrangement is used.

The opt_assoc_balance() function arranges these units in the higher-performance tree configuration. Since it does not impact the compactness of the Verilog produced by CHDL, it is not called by the optimize() function and must be invoked explicitly.

• opt_limit_fanout()

A late addition to the CHDL optimizations, opt_limit_fanout() is included to make estimates of area based on CHDL's technology mapping algorithm more realistic by buffering the outputs of gates with fan-outs greater than a given threshold. As this does not make the generated Verilog more compact, it is not included in the set of optimizations invoked by optimize().

• opt_set_dontcare()

The CHDL simulator does not use 6-state logic and does not include the concept of an undefined value, but it is possible to call the Lit() function with a literal ASCII 'x' as its argument, and this produces a litimpl object having no explicitly defined value. This may be used when representing a function as a vec containing a truth table followed by a multiplexer, to enable further optimization.

The opt_set_dontcare() optimization evaluates all of these Lit('x') nodes to literal 0 or 1, attempting to eliminate as many gates as possible by doing so. This is a single-pass algorithm in which each undefined node is visited once, in turn, and the locally hardware-minimizing decision is made.

• optimize()

The function from the CHDL optimizations most likely to be invoked by CHDL code outside of the library itself is simply called optimize(). This invokes a series of optimization functions and prints, after each, the number of remaining CHDL nodeimpl objects.

• techmap()

The CHDL technology mapping feature is intended to be used for area and power estimation, and its primary advantage over synthesizing the Verilog produced by CHDL is that it does not tie up costly logic synthesis tool licenses. It does not take timing or placement into account or generate a clock distribution network, and would therefore not be useful for the generation of high-performance circuit implementations of CHDL designs. It does, however, provide a valid implementation of a CHDL in-memory netlist using logic cells from a specified library, providing a usable upper bound for area and lower bound for performance in a given technology.

The algorithm employed by techmap() is a simple greedy tree matching algorithm, taking as inputs (1) a set of logic cells expressed as trees of nand, invert, D flip-flop, and tristate functions and (2) a CHDL in-memory netlist, and writing as output a textual representation of the technology-mapped netlist.

3.4.3 Node Types

• litimpl, Lit()

Arguably the simplest component available to the designer of logic circuits is the tie to a constant logic level. In CHDL, subclasses of nodeimpl are used to provide logic functionality, each providing its own implementation of the eval() function used during simulation. The eval() function of litimpl simply returns a fixed 1 or 0 value and provides a way to mark constants to be propagated during optimization.

• invimpl, nandimpl

The inverter and nand gate implementations simply implement their associated combinational logic functions; their eval() functions return the logical nand of their predecessors. Perhaps Nand() and Inv(), and even Lit(1), could have been generalized into a single n-input nand gate, but the separation into a nand-inverter graph containing only the one and two-input versions of the nand function simplifies the process of finding matches during technology mapping, when cells available in a logic library are mapped to the generic in-memory netlist. A performance-enhancing feature used during simulation, implemented by both invimpl and nandimpl, is memoization. During a given simulation clock period, as discussed in the earlier discussion of cdomain and tickable, the value of a nandimpl or invimpl is unchanging. This value is cached along with the most recent time step at which an evaluation occurred. If the cached value is current, the inputs are not re-evaluated, saving a potentially exponential-in-logic-layers number of calculations that would quickly render even the simplest of simulations impossible.

• regimpl

The basic sequential logic unit in CHDL is the D flip-flop. As described in Section 3.2, there are many advantages to synchronous logic design, and while it is not the only paradigm used, especially in highly-specialized high-performance structures, the advantages of a synchronous design methodology in testability and verifiability make it at least a compelling default. Because of this, CHDL is designed with only synchronous logic in mind, and in many designs the D flip-flop is the only sequential logic element present.

A CHDL D flip-flop is both a tickable object and a nodeimpl, providing an output that depends on the value of an input in a prior cycle. Since this dependency crosses a temporal boundary, the input node's ID is not held in the pred array of the nodeimpl base class but in a special member variable, d. This distinction holds for all sequential logic so that, e.g., combinational logic cycle detection can be performed without knowledge of the particulars of sequential logic in CHDL.

As mentioned in Section 3.4.1, the inputs to regimpl are evaluated during the tick phase and the outputs are updated during the tock phase. This ensures that any other logic depending on the regimpl's output during a clock cycle reads the appropriate value. Failure to do so can be interpreted as a simulated hold violation, where the zero-propagation-delay D flip-flop is too fast for the zero-hold-time D flip-flop that depends on it, and the new value inadvertently gets propagated through. This may lead to such strange effects as stages in shift registers being skipped. Subdividing the clock step into four phases is like imposing infinitesimal propagation delays on CHDL elements and serves to eliminate this kind of simulated hold time violation.

• tristatenode and tristateimpl

Unlike more circuit-oriented logic simulators supporting six-level logic, values in the CHDL simulator are restricted to Boolean values. In the CHDL simulator, nodes cannot be weakly or strongly pulling up or down, and cannot be high-impedance or undefined. Such characterization is important in some domains, but in the synchronous world of CHDL it is possible to simulate the relevant behavior of a wide range of structures and describe their structure in a synthesizable manner without resorting to six-level logic. Tristate structures like buses are one area where it may be counterintuitive that they can be described in CHDL, simulated, and synthesized from CHDL descriptions with only support for Boolean values. The key observation is that, while not every participant in a tri-state net is driving it all the time, in any correctly functioning system employing tri-state elements, any time a tri-state net is being read, a value is being driven on that net by at least one driver. Taken in aggregate, a tri-state net can be modeled as a single logic function including every driver and each driver's enable signal.

It is, of course, possible to model this behavior as a simple network of basic logic gates and not involve a CHDL concept of tri-state nodes at all. However, the most important use of tri-state nodes in CHDL is not within modules; whether to use a traditional multiplexer design or a tri-state design internally is more of a logic synthesis question than a basic modeling question. Instead, the most important use of tri-state nodes is between modules, in the definition of interfaces.

Since tristate nodes must be able to be exposed as tristate by CHDL's synthesizable Verilog generator, a trisimpl node type is provided by CHDL. It has a variable, even number of inputs, alternating between input signals and their corresponding enable signals. When set up as an output, a trisimpl node will generate a Verilog inout port. A connect() member function provides a way to add new trisimpl inputs without directly manipulating the pred vector, though this is merely for internal convenience.

Table 3: Logical operations and overloads.

Function   Operator   Description
Inv()      !          Logical not
Nand()                Logical nand
Nor()                 Logical nor
And()      &&         Logical and
Or()       ||         Logical or
Xor()      !=         Exclusive or
           ==         Exclusive nor

Unlike other node types in CHDL, it is necessary for trisimpl to have its own corresponding handle type, tristatenode. This type includes a member function, connect(x, en), that connects an input x to the tristatenode through a tri-state buffer with an enable signal en.
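A short hypothetical use, wiring two drivers onto a shared bus bit:

    tristatenode bus;
    node a, en_a, b, en_b;
    bus.connect(a, en_a);  // driver A, enabled by en_a
    bus.connect(b, en_b);  // driver B; at most one enable high at a time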

3.4.4 Logic Functions Library

• Logical Operations and Operators

For the CHDL node object, all of the usual logical operations are provided. By convention in CHDL, functions that act as generators of hardware are named in camel case, i.e. with the beginning of each word within the function name denoted by a capital letter rather than an underscore or other separator, and with an initial capital. Table 3 provides a full list of these functions and operator overloads. The basic logic functions Inv() and Nand() instantiate invimpls and nandimpls respectively; the others build their operations out of these basic operations.

• vec<N, T>: Fixed-length vectors

Much of the utility of RTL representations over gate-level netlists comes from the ability to represent groups of signals with single names, e.g. as arrays, and pass them around as aggregates. CHDL provides a data type, vec<N, T>, that combines N instances of type T and allows them to be indexed both with integers and with instances of the class range<A, B> to select sub-vectors. Due to its ubiquity, the type vec<N, node> is aliased to the name bvec<N>. The bvec is used throughout CHDL and related libraries whenever a collection of bits is needed.

The constant sz<vec<N, T>>::value is simply defined as N times sz<T>::value, so sz<bvec<N>>::value is simply N. This potentially-recursive definition allows the sizes of multi-dimensional arrays to be computed at compile time.

Table 4: Bit vector operations and overloads.

Function     Operator   Description
Not()        ~          Bitwise not
And()        &          Bitwise and
Or()         |          Bitwise or
Xor()        ^          Bitwise xor
AndN()                  And-reduce
OrN()                   Or-reduce
XorN()                  Odd parity (xor-reduce)
EqDetect()   ==         Equality
             !=         Complemented equality

• Basic bit vector operations

It would be highly inconvenient to need to break signals out of bit vectors one-by-one in order to perform operations on them, so basic functions and, where available, corresponding operator overloads are provided for basic operations on bit vectors. These are summarized in Table 4. Arithmetic operations and shifts are also available, but as these require the generation of nontrivial combinational logic, they are covered separately.

A feature of C++ added within the past decade that is used by the vec types is list initialization: the ability to initialize collections by passing a list of initial contents to the constructor. With CHDL vectors, this feature allows a vector's value to be assigned without manually assigning each entry, saving optimization time by avoiding the creation of unused Lit(0) objects.
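For instance, a small table of constants might be initialized in one declaration (the Lit<N>() vector-literal form is assumed):

    vec<4, bvec<8> > powers = { Lit<8>(1), Lit<8>(2), Lit<8>(4), Lit<8>(8) };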

• Cat() and Flatten()

Multi-dimensional arrays are useful structures, as are the aggregate types explored in Section 3.5, but sometimes it is preferable just to have a flat, one-dimensional array of bits. The Flatten() function for CHDL vec types takes any vec type and recursively calls Flatten() on its members. The Flatten() function for node returns a bvec<1>. The end result is that Flatten() called on an object of type T returns a bvec<sz<T>::value>.

A set of four utility functions for concatenating vectors together, as well as adding single elements to vectors, are all called Cat(). These are templated functions that take two arguments, either of types vec<N, T> and vec<M, T>, or vec<N, T> and T, and concatenate them into a single vector. There is also a single-argument version of Cat() that returns a special concatenator object. This is a functor, meaning it can be called as a function, allowing long concatenations to be built with a syntax like “Cat(a)(b)(c)...”. The final result, a concatenator, may be automatically cast to its result type.

Because the node type in CHDL has the unique property that its assignment operator is const, since assignments do not modify the nodes themselves but the internal connections between nodes and nodeimpls, non-reference return values from functions can be used as the targets of assignments. This means that, in CHDL, it is common to see such constructions as “Cat(a, b) = c;” and “Flatten(a) = b;”.
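For example, a byte can be split into nibbles by assigning through a concatenation (the bit ordering shown is an assumption):

    bvec<4> hi, lo;
    bvec<8> x;
    Cat(hi, lo) = x;           // hi receives one nibble of x, lo the other
    bvec<8> y = Cat(hi)(lo);   // functor-style concatenation back into a byte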

• Zext(), Sext()

Zero extension and sign extension functions are provided for bvecs. These utility functions allow bit vectors to be expanded as needed. If passed a width argument smaller than the size of the input bvec, they also function as truncation functions.

• Shifter(), <<, and >>

The first function mentioned in this section, CLOG2(), is vital for shifters, as the number of bits needed to specify the shift amount is equal to ⌈log2 N⌉, where N is the number of bits in the input and output. CHDL provides both overloads of the C++ << and >> operators that operate like their unsigned counterparts, and a general Shifter() function that implements a barrel shifter with arithmetic, logical, and rotate modes. These operations are truncating; the width of the output is the same as the width of the input, to accommodate typical usage in datapaths of fixed width.

• Adder(), Mult(), Div(), +, -, *, and %

The Adder() function instantiates a Kogge-Stone carry lookahead adder with carry inputs and outputs. This is used by the operator overloads for the binary + and - operators to generate an adder; in this case, only the sum is returned. The Mult() function and the equivalent * operator generate a Wallace tree multiplier. The Div() function produces an integer divider formed by performing up to one subtraction per bit of input. This is quite a low-performance operation and is intended primarily for division by constants, a scenario in which most of the logic required for general division is optimized away. The arithmetic operations truncate their results, and the result has the same number of bits as the operands, to accommodate typical usage in a fixed-size datapath.

• Mux()

CHDL provides two classes of multiplexers: a 2-to-1 multiplexer with a single node as a select, and a general 2^N-to-1 multiplexer with a bvec<N> select signal. Both of these are implemented as templated functions, building multi-bit multiplexers using the Flatten() function and the version of the function for node. The 2^N-to-1 multiplexer is built recursively from smaller multiplexers, with a base version for the degenerate case of a 1-to-1 multiplexer taking a 0-bit select signal.
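Typical invocations might look like the following; the select polarity and argument order are assumptions:

    node sel;
    bvec<8> a, b;
    bvec<8> r = Mux(sel, a, b);  // 2-to-1: a when sel is 0, b when sel is 1
    bvec<2> idx;
    vec<4, bvec<8> > in;
    bvec<8> s = Mux(idx, in);    // 4-to-1 with a 2-bit select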

• Decoder(), Enc(), PriEnc(), Log2()

CHDL provides a function, Decoder(), implementing a general N-to-2^N decoder with an optional enable input. Encoders are also provided, including an encoder that expects a one-hot input, producing meaningless results if more than one input is high, and two versions of priority encoder. The two priority encoders differ only in the priority order of their inputs, with PriEnc() prioritizing bits with smaller indices and Log2() prioritizing bits with higher-numbered indices, thereby having an output whose value is ⌊log2 x⌋ for input x.

3.4.5 Inputs, Outputs, and Simulation Interfaces

• tap(), TAP(), and OUTPUT()

As C++ does not provide an automatable way to expose the names of variables within a program to that program during execution, naming the signals tracked during simulations and naming the ports of generated synthesizable Verilog must be done through alternate means. The mechanism for this is the tap system. The tap() function identifies signals of interest for waveform output and also, when passed true as its default-false final argument, identifies signals that should be made outputs of the generated Verilog. It is intended to be overloaded for aggregate types so that a single call to TAP() exposes all of their constituent signals. Taps are held in an array that is traversed during netlist generation and during every cycle of simulation, with a flag indicating whether each tap is an output stored alongside its entry. CHDL nodeimpls and vectors thereof marked as taps have their values output, when they change, into the .vcd waveform file on each cycle of the simulation.

To add a signal to the tap array, the tap() function is called with a string representing the signal name and a signal, either a node or a vec, as its arguments. The macros TAP() and OUTPUT() are provided for convenience, using the stringification feature of the C++ preprocessor to report the signal name as it appears in the macro invocation.
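For example:

    bvec<32> pc;
    TAP(pc);     // waveform tap named "pc"
    OUTPUT(pc);  // tap that also becomes an output port of the generated Verilog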

Taps are important to CHDL optimizations because they are used to mark nodes as visible. Nodes that are not visible from taps may be eliminated by the dead node elimination operation. Because of this, a special category of tap, the ghost tap, is also available for nodes that are used internally during simulation but are not visible when directly following combinational logic from outputs.

• Input() and Input<N>()

Outputs of the generated Verilog module are simply special cases of taps, identifying signals that already exist as the outputs of already-declared logic. Inputs, however, are different. A special subclass of nodeimpl which cannot be simulated, inputimpl, is needed for inputs to the module. The functionality provided is a basic set of inputs that may be either single nodes or bvecs, one-dimensional arrays of nodes. This is expanded on in the template library's support for directed structured signals, described in Section 3.5.2, where inputs and outputs consisting of multi-dimensional arrays may be declared, but all of this support is built using the simple Input() and Input<N>() functions, and each sub-component of a structured signal or multi-dimensional array becomes a differently-named single input or output.

• Design Hierarchy

Sometimes it is useful, for debugging or design analysis, to have a concept of design hierarchy. CHDL includes functions hierarchy_enter() and hierarchy_exit(), and their macro equivalents HIERARCHY_ENTER() and HIERARCHY_EXIT(), that allow the designer to tag levels of design hierarchy. Traditionally these are placed at the beginning and end of functions, and the macro versions automatically tag a region of the design with the name of the function from which they are invoked. Since it is convention in CHDL to use functions to define design modules, this provides a reasonable hierarchy, and all CHDL functions defining hardware modules, from Div() to And(), include calls to HIERARCHY_ENTER()/HIERARCHY_EXIT(). Position within the hierarchy is maintained as a property of nodeimpl objects, and the optimizations preserve this value.

• Submodule()

When print_verilog() is called, the entire CHDL in-memory state is written to a single synthesizable Verilog netlist. If the intent is to use CHDL to design a block that contains IP provided with a Verilog interface, e.g. hard macros or soft macros, the Submodule() interface is used to instantiate these blocks. The only argument to Submodule() is the name of the module type. This is followed, through a functor interface much like that of the single-argument Cat() function, by a list of input signals, a list of output signals, and a list of bi-directional signals into and out of the instantiated module. This creates a module object, and all of the outputs of this object are new mnodeimpl objects.

The Submodule() interface provides an illustration of why CHDL's interoperability between different hardware description paradigms and levels of abstraction is valuable. The use of external IP written in Verilog requires that IP to be wrapped in a CHDL interface. If, for example, that IP provides some floating point operations, any operators performing floating point operations would have to be overloaded for the specific subset of floating point data types supported by the available IP catalogue. This interface layer must be written and maintained separately from both the designs in CHDL that use it and the IP itself, leaving it independently vulnerable to errors and in need of maintenance. If both of these pieces were described in the same language, no glue layer would be needed. As a multi-paradigm language, CHDL hopes to reduce the number of such language boundary crossings needed in the typical design.

• Ingress() and Egress()

The simulator state is available to code outside of CHDL through the Ingress() and Egress() functions. These functions allow variables and functions external to the CHDL simulator to be updated by the CHDL simulator as the simulation progresses. The simplest versions of the Ingress() and Egress() functions simply call functions passed in as arguments, with values from the CHDL simulator as the arguments. C++ lambdas may be used as a simple mechanism to give these functions access to global and local variables. Utility functions IngressInt() and EgressInt() enable integer types to be updated from within the CHDL simulator.

The purpose of these functions is to allow interactions between external code and CHDL-defined hardware designs while using the advance() function to step the simulator through single cycles. The domain in which this is most potentially high-impact is the interoperation between the CHDL simulator and simulation frameworks like the Structural Simulation Toolkit [47], described in Section 3.6.1.3. By allowing external models for memory system components to be used, the CHDL SST component allows the evaluation of architectures developed with CHDL in configurations that would be difficult to simulate entirely at the bit level.

The functions ConsoleIn() and ConsoleOut() use Ingress() and Egress() to allow simulated hardware to be connected directly to console input and output. This is designed to provide a simplified way to evaluate core designs running benchmark and test programs that produce textual output or require textual input.

• Assert()

Much of the validation and verification support provided in traditional hardware description languages by non-synthesizable behavioral code is provided in CHDL mostly within the logic of the simulated hardware itself. The Assert() function and accompanying ASSERT() macro provide a way to stop the simulator and generate an error message containing the line number of the failed assertion when a predicate becomes false. This increases error exposure in simulation without adding additional hardware to synthesis output.
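A hedged sketch of the intended usage (the handshake signals are illustrative):

    node valid, ready;
    ASSERT(Or(Inv(valid), ready));  // halt simulation, with line number,
                                    // if a valid request is ever dropped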

3.5 CHDL Template Library

While the CHDL core library provides a set of logic functions and a vector data type in addition to its basic netlist manipulation features, the set of included modules is fairly basic. Features and pre-designed modules not included in the CHDL core library that are considered universal enough to be widely useful are included in the CHDL-STL instead.

This includes support for structured data types, directed data types, pseudo-random number generation, and RTL-style conditional assignment expressions.

3.5.1 Aggregate Data Types: ag and un

A C++ struct or class containing CHDL datatypes is an appropriate choice for many types of structured data that may be used in CHDL, especially in cases where there are frequently-used combinational functions of the elements that may be generated by member functions. By assigning bits from members to other members in the constructor, fields of such a struct can be made to refer to the same data, similar to a C union data type. The disadvantage of this approach to structured data is that the names of the fields, and the fact that they are arranged into a structure, are invisible to the simulator and netlist generator.

A template metaprogramming technique known as the type list was applied to the problem of combining multiple fields into single structured data types in the CHDL template library. This technique and a technique for converting strings into C++ types were employed to create a system of structured signals in CHDL, supporting both ag types, in which the fields are concatenated end-to-end in the flattened representation, and un types, in which the fields all map to the same set of nodes, occupying the space of the largest member. The use of a type list allows template metaprograms to operate on ag and un types, iterating through elements at compile time. This is used by the sz<> class template specialization to compute the size of aggregate types and by the Flatten() function template to recursively concatenate aggregate types into a single bvec.
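A sketch of what such a declaration might look like, assuming the STP() string-to-type macro and an underscore-named field accessor from CHDL-STL (the spellings here are assumptions, not verified API):

    typedef ag<STP("valid"), node,
            ag<STP("data"),  bvec<32> > > flit_t;
    flit_t f;
    _(f, "valid") = Lit(1);                   // named field access
    bvec<sz<flit_t>::value> raw(Flatten(f));  // fields concatenated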

3.5.2 Directed Data Types: directed, in, out, inout

Aggregate types provide a convenient way to bundle signals together into groups that may be manipulated as a single unit and passed through generators provided by the CHDL core library, including Reg() and Mux(). A shortcoming of aggregates alone is that their fields do not specify direction. Calling the OUTPUT() macro on an ag or un creates outputs out of all of the fields. The class template directed and its aliases in, out, and inout provide a way to designate signals as inputs or outputs to their respective blocks. A reverse template metaprogram is available to produce the complementary directed type, allowing both sides of an interface to be declared without replicating code. Signals of a directed type and its complement may be connected using a Connect() function that is also provided as part of the directed signals library.
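A sketch of declaring both sides of an interface this way (the reverse<> spelling is an assumption):

    typedef ag<STP("req"), out<bvec<8> >,
            ag<STP("ack"), in<node> > > producer_t;
    producer_t p;                   // req is an output, ack an input
    reverse<producer_t>::type c;    // complementary: req in, ack out
    Connect(p, c);                  // legal only for complementary types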

3.5.3 Loadable Module Support

Directed aggregate types allow the signal structure, if not the protocol, of interfaces for modules to be expressed in a single data type. This is exploited by the CHDL module loader, a separate feature from the external Verilog module support provided by the CHDL core library. The CHDL module loader provides an Expose() function that creates inputs

and outputs based on the direction of each signal. A LoadModule() function is provided that parses netlists with Expose()-created interfaces and allows them to be connected to aggregates with complementary interfaces. This allows the distribution of IP as loadable modules and the combination of pre-optimized IP into larger-scale single modules. This is the preferred method for IP distribution in CHDL, as, unlike Verilog sub-modules, it allows simulation within CHDL.

3.5.4 IP Block Generators

• Linear-Feedback Shift Registers

Linear feedback shift registers produce a string of bits with a nearly 50% 1/0 balance

with a period of 2^N for N bits. This makes them useful for a variety of pseudo-random

number generation and counting applications. The LFSR implementation in CHDL takes

the polynomial and number of bits generated per cycle as template arguments.
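A software model of the behavior, in plain C++ rather than the CHDL generator, may clarify the role of the polynomial (a Fibonacci-style LFSR; assumes n < 32 and a GCC-style parity builtin):

    #include <cstdint>
    uint32_t lfsr_step(uint32_t s, uint32_t poly, unsigned n) {
      uint32_t fb = __builtin_parity(s & poly);  // XOR of the tapped bits
      return ((s << 1) | fb) & ((1u << n) - 1);  // shift in the feedback
    }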

• Sorting Networks

Sorting networks are an abstract mathematical formulation of the problem of parallel

sorting, in which sorting algorithms are constructed from basic compare-and-swap operations. It is known that optimal sorting networks exist with depth, and thus, in hardware, time complexity, O(log N) in these operations, but discovering them and even proving that they are sorting networks is computationally hard. There are, however, efficient algorithms for producing sorting networks of depth O(log^2 N), and one of these, Batcher's even-odd merge sort, is used to produce CHDL's sorting network implementation.

CHDL’s implementation of this is templated so that it may be used on any signal type

providing a comparison operator.
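The compare-and-swap primitive underlying such a network might look like the following (a sketch; the library's exact interface may differ):

    template <typename T> void CompareSwap(T &a, T &b) {
      node s(b < a);                        // comparator hardware
      T lo(Mux(s, a, b)), hi(Mux(s, b, a)); // select min and max
      a = lo; b = hi;                       // afterward, a <= b
    }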

• Hardware FIFOs and Stacks

Templated functions generating both stacks and FIFO queues, called Stack() and

Queue(), are provided as part of the template library. These simple utility functions are included because queues are needed as buffers by the memory system components, and stacks require only a simple modification to these structures.

Table 5: Memory system components provided by the CHDL template library.

  Component      Description
  Scratchpad()   Scratchpad SRAM with memreq interface
  Rom()          Pre-programmed ROM with memreq interface
  ExtAddr()      Extend or truncate address
  Share()        Share single interface among multiple cores
  SizeAdaptor()  Adapt narrow core interfaces to wider interfaces

Table 6: Numeric types defined in the CHDL template library.

  Component   Description
  si<N>       Signed integer of length N
  ui<N>       Unsigned integer of length N
  fxp<W,F>    Fixed point, W bits whole and F bits fractional
  fp<E,M>     Floating point, E-bit exponent and M-bit mantissa

3.5.5 Memory System Interfaces and Components

Core and accelerator designs in CHDL share a tagged memory system interface defined

by basic memory request and response types, provided by the CHDL template library.

In addition to the types themselves, memreq and memresp, where

• B is the width of a single “byte”, the minimum writable unit, in bits,

• N is the number of bytes in a single transfer “word”,

• A is the length of the word address, and

• I is the length of the request ID/tag in bits,

there are a number of IP blocks for memory system components, enumerated in Table 5.

3.5.6 Numeric Types and Operations

A variety of numeric types are included in the template library, along with functions generating arithmetic operations and operator overloads for those functions, and a pair of functions, IToF() and FToI(), for converting between integral and floating point types.

The CHDL numeric types are enumerated in Table 6.
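For example, fixed-point signals can be declared and combined through the overloaded operators (a sketch; template spellings follow Table 6):

    fxp<8, 8> x, y;         // 8 whole bits, 8 fractional bits
    fxp<8, 8> sum(x + y);   // adder generated by operator overload
    fxp<8, 8> prod(x * y);  // fixed-point multiplier
    fp<8, 23> f;            // single-precision-like layout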

    const NN(2*CLOG2(N + 1));
    rtl_reg<...> a;
    rtl_reg<...> i(Lit(2)), j;
    rtl_reg<...> init(Lit(1)), find, mark;

    IF(init) {
      IF(i == Lit(N)) {
        i = Lit(2); init = Lit(0); find = Lit(1);
      } ELSE {
        a[Zext(i)] = Lit(1); i += Lit(1);
      } ENDIF;
    } ELIF(find) {
      IF(i * i >= Lit(N)) {
        find = Lit(0);
      } ELIF(Mux(Zext(i), a)) {
        find = Lit(0); mark = Lit(1); j = i * i;
      } ELSE {
        i += Lit(1);
      } ENDIF;
    } ELIF(mark) {
      IF(j >= Lit(N)) {
        i += Lit(1); find = Lit(1); mark = Lit(0);
      } ELSE {
        a[Zext(j)] = Lit(0); j += i;
      } ENDIF;
    } ENDIF;

Figure 5: Example design using CHDL-RTL; an implementation of the Sieve of Eratosthenes.

3.5.7 RTL Description in CHDL

The most universally-accepted method for providing designs to synthesis tools targeting silicon and FPGAs is register transfer level description in the synthesizable subset of Verilog or VHDL. CHDL as presented so far enables the construction of complex designs, but assignments are continuous and registers are instantiated as hardware blocks. The presentation of designs as a set of assignment statements that occur when triggered by the combination of a clock rising edge and a logical predicate described as a set of nested conditions is not allowed by the CHDL core library, which, to borrow terms from Verilog, allows wires but not regs. A CHDL register is more akin to instantiating a group of D flip-flop primitives in such a hardware description language than it is to an RTL representation. Due to the ability of the Reg() function in CHDL to be generic, this is not as much a limitation as

it may sound, but for more complex state machines the ability to have potentially-nested RTL statements would be preferred.

Figure 6: Waveforms for the example from Figure 5, given N is 16.

For this reason, CHDL-RTL was added to the CHDL template library. This provides a templated data structure, rtl_reg, that may take zero or one assignments per clock cycle from an assignment statement, predicated on whether that assignment statement would be "reached" in its corresponding position in a nested block of rtl_if(), rtl_elif(), and rtl_else() calls. These calls may be wrapped in optional macros to replace the longer names like "rtl_if" with shorter names like an all-caps IF, as used in the example in Figure 5.

3.6 Introspection

The CHDL libraries provide both low-level and high-level forms of hardware description.

With the core library and the template library, it has been shown that it is possible to describe designs from the gate level through the register transfer level, and higher-level options are presented in later chapters. CHDL not only encompasses multiple levels of abstraction, but it also stands in for several phases in the design tool chain. It is not uncommon for the same program to generate a design, produce simulation results, and output a standard cell netlist. This vertically integrated design, coupled with the APIs provided by CHDL for reading and modifying netlists, enables a novel technique called netlist introspection: interaction with the design structure by the same program that generates it.

In the remainder of this section, it is shown that:

• Netlist introspection can be used to implement a range of performance, design quality,

and utility enhancing software modules.

• With netlist introspection as a part of the CHDL API, these software modules may be

implemented as part of CHDL, as external libraries, or alongside hardware designs.

This key feature is one of the aspects that distinguishes the CHDL environment from similar tools like Chisel [4] and SystemC [42]. While these bring the power of a general-purpose programming language to bear on hardware design problems, they do not include as part of their API a low-level interface enabling netlist introspection. This could potentially have impact on design costs for accelerator architectures by reducing the critical path of the design cycle and improving support for rapid prototyping and evaluation.

3.6.1 Applications of Netlist Introspection

The exposure of the nodes array and the nodeimpl subclass declarations to the library user enables netlist introspection. This feature supports a host of novel techniques, including those described in this section, bringing down synthesis, optimization, and simulation times and enabling more rapid design space exploration and novel architectures.
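A minimal sketch of what such introspective code looks like (the invimpl subclass name is an assumption; the nodes array is as described above):

    unsigned inverters = 0;
    for (nodeimpl *n : nodes)          // walk the entire netlist
      if (dynamic_cast<invimpl*>(n))   // identify gate type by subclass
        ++inverters;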

3.6.1.1 Module Caching

CHDL is able to handle designs at scales ranging from single arithmetic functions to billions of logic gates. At the larger end of this scale, the amount of time required to build a design can become quite large, limiting opportunities for design iteration and improvement. This build time comes in several stages:

• Compilation; running the C++ compiler to convert design .cpp files into executable

generators.

• Elaboration; executing the generators and building an initial representation of the

design.

• Optimization; reducing redundancy and improving performance in the internal design

representation.

• Simulation and validation; producing result vectors and validation results.

• Synthesis; producing a netlist suitable for use by an FPGA or ASIC design flow.

Build systems are a widely-accepted solution for decreasing compile times. By dividing designs into multiple source files and automatically re-building only the sources that change between iterations, compilation time can be drastically reduced. Similarly, elaboration and optimization time can both be reduced through module caching. With module caching, a

CHDL function is executed only if a corresponding cache file does not already exist on disk.

Otherwise, the cached module is loaded from the disk in lieu of generating it. Caching can be enabled for large pieces of the design, leading to considerable performance improvement for elaboration following the initial run, even in cases where parts of the design outside of the cached portion are modified. Performance can be increased further by pre-optimizing the cached module files, decreasing their size and eliminating redundant optimization.
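A hypothetical sketch of the caching idiom; the load/save call names here are assumptions based on the description, not the library's verified API:

    if (!LoadModule("fpmul.chdl", io)) {  // cache hit: netlist read from disk
      FpMul(io);                          // cache miss: run the generator
      SaveModule("fpmul.chdl", io);       // write (pre-optimized) netlist
    }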

The module loader and writer used for caching are the same as the user-visible module loader mentioned in Section 3.4, except inputs and outputs are not declared using directed aggregates. Both inputs and outputs are defined collectively and flattened down into a single large I/O vector. This enables compatibility with both directed and non-directed aggregates, vectors, nodes, and user-defined types, as long as they conform to CHDL conventions. The performance improvement experienced by a simple floating-point design, a Mandelbrot set visualizer, is shown in Figure 7. The modules being cached in this case are the floating point adder and multiplier. Elaboration and optimization performance collectively doubled, and elaboration performance alone more than quintupled, in the double precision version of that design. Naturally, as the logic functions being cached come to represent a larger portion of the total design size, the amount of performance improvement experienced increases.

Figure 7: Caching and pre-optimization of floating point operation implementations improves both elaboration and optimization time for a Mandelbrot sample design. Bars show run-time of first design elaboration, optimization command run, and subsequent runs with pre-cached floating point operation netlists.

3.6.1.2 Mixed-level Microarchitecture Simulation

Implementation and simulation of microarchitectures occupy two very distinct realms of activity, despite the fact that simulators are, in a sense, implementations of relevant features.

A large part of the problem is simulation speed. A gate level implementation of a processor is a very high-fidelity model, but runs much more slowly than a comparable higher-level simulation. In the example, a simple accumulator architecture processor model was taken

and all instances of processor words and their equivalent nodes and nodeimpls were replaced with a higher-level modeling construct: a full processor word with an associated set of arithmetic operations and a nodeimpl subclass for interoperating with gate-level portions of the design. Using this nodeimpl subclass, individual bits can still be accessed when needed, while all gate-level modeling of arithmetic operations on these words is replaced with integer operations. This accelerates elaboration, optimization, and simulation, and is easily disabled for synthesis with a single #define.

Figure 8 presents the results of an analysis in which the bit width of the accumulator-based variable-width Harvard architecture integer processor was swept from 16 bits to 64 bits and the elaboration, optimization, and simulation time of both the gate-level and mixed-level models was recorded. As expected, the higher-level model outperformed the lower-level model significantly, and this rift in performance only increased with the data path width.

Figure 8: Significant speedup in all stages of design can be achieved through the use of mixed-level simulation of microarchitecture models.

3.6.1.3 Integration with the Structural Simulation Toolkit

Mixed-level modeling has also been implemented by packaging the CHDL simulation environment as a component for the Structural Simulation Toolkit [47]. This makes use of the introspective ingressimpl node type and the Egress() function to provide input and output over SST communications links for CHDL components. This allows high-performance modeling of potentially many memory system components and cores while enabling low-level designs implemented in CHDL to be simulated in low-level detail and debugged. This integration was used successfully to debug the load-linked and store-conditional instruction implementations provided by the Iqyax RISC core described in Section 4.1 in multi-core operation, providing a valuable mechanism for debugging this support within SST's library of memory system components. For this purpose, multiple Iqyax cores were simulated using the CHDL simulation environment, in conjunction with a simulated coherent cache and network-on-chip.

3.6.1.4 User-Defined Optimization: Retiming

The optimizations mentioned in Section 3.4 are implemented using the same interfaces as netlist introspection. There are no additional internal APIs used by these optimizations; they could just as easily have been implemented outside of the CHDL library itself.

One transformation in particular has proven useful in this work, e.g. in the FPU of the Harmonica soft core [30], but is not used frequently enough to justify inclusion in

the main CHDL library, and this is the automatic generation of pipelined functional units from non-pipelined, combinational implementations. Anticipating a reconfigurable fabric as a target, in which adding D flip-flops to logic is not very costly, this algorithm simply organizes the relevant gates into layers, finds edges in the netlist that cross layer boundaries, and inserts additional registers on these edges. This simple algorithm, implemented in 160 lines of C++, drastically increases the throughput that can be achieved when using complex functional units, as illustrated in Figure 9. Implementing this algorithm using netlist introspection allows it to be contained within the source for the design that requires it, not externally added to the tool flow as a separate program, reducing workflow complexity and eliminating a potential source of error.

Figure 9: The retiming transformation automatically pipelines functional units. In this example, a floating point multiplier is retimed to between 2 and 6 stages, leading to an increase in achievable throughput with a slight overhead in latency and design size.

3.6.1.5 User-Defined Analysis: Power Emulation

Power emulation [14] is a class of techniques for modeling the power consumption of a hardware system using an FPGA. Using CHDL, a power emulation framework that takes advantage of netlist introspection to augment the in-memory netlist with a high-performance activity-based energy model has been prepared. This energy model assumes that the energy required to toggle the state of any one gate's output, including driving any interconnect following the gate, is invariant, intentionally ignoring temperature and state dependence.

The total energy expended during a given cycle is modeled as the sum of toggling energies for all gates that toggle during a given cycle, or:

\[ E_{cyc} = \sum_{i=0}^{\#\mathrm{gates}} E_i T_i \]

where $E_i$ is the dynamic energy expended during the cycle if the node toggles and $T_i$ is 1 if the node toggles in a given cycle and 0 otherwise. $E_i$ is computed within the power model for a given node as the sum of the intrinsic toggle energy for the gate producing its value and the input energies for successors:

\[ E_i = E_{\mathrm{int}} + \frac{1}{2} V_{dd}^2 \sum_{\mathrm{inputs}} C_i \]

where $C_i$ is the capacitance for gate input $i$, a function of the types of the successor gates, and $E_{\mathrm{int}}$ is the intrinsic energy expended by the gate type during toggling, a function of the gate type. This is computed statically based on the post-technology-mapping netlist and is an intrinsic part of the model; it does not change over time and is hard-wired into the energy model produced. It does not directly take interconnect into account, since it is not based on a physical placement of the gates, and for all analyses performed using this tool, a constant scaling factor for interconnect overhead has been used.
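In software form, the static computation of $E_i$ over an introspected netlist might look like the following sketch (the helper names are assumptions, not CHDL's API):

    double node_energy(nodeimpl *n, double vdd) {
      double e = intrinsic_energy(n);        // E_int for this gate type
      for (nodeimpl *s : successors(n))      // capacitance of each successor
        e += 0.5 * vdd * vdd * input_cap(s);
      return e;                              // E_i, fixed at generation time
    }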

The challenge, ordinarily, of running such a model is the time required to compute $T_i$, whether each gate toggles, for each cycle in the program. By running the netlist in an FPGA instead of performing traditional gate-level simulation, this issue is side-stepped. The primary disadvantage of fully instrumenting such a design running in an FPGA is a significant overhead in FPGA area. This model is flexible in that it allows instrumentation of a smaller number of gates to decrease area overhead at the cost of accuracy. Instead of instrumenting every gate, a static sample, a subset of the gates, can be selected. Controlling the size of this sample allows accuracy to be traded for FPGA area overhead and performance, and perfectly reasonable models can be created by instrumenting only 10% of the logic gates in a design, as discussed in the evaluation.

This energy model generator is based on the activity detector circuit shown in Figure 10.

Figure 10: The basic gate-level instrumentation and the pipelined Wallace tree into which it feeds.

This is connected to each instrumented gate, and its outputs are connected to a design-specific adder tree to perform the summation step. The output of this adder tree is the module's cycle energy.

The adder tree is a pipelined Wallace tree. This structure for adding N numbers in O(log N) time and O(N) space is traditionally used for building multipliers by adding shifted partial sums from N adders. In this case, the structure is used to sum the outputs of all of the activity detectors, connected in such a way as to evaluate to $E_i$ if the gate toggles and 0 if it does not. This is done by connecting each activity detector's output to a subset of Wallace tree inputs, as shown in Figure 10.

The input to the Wallace tree can be thought of as a vector $v$ of sets of bits, where each element $v_j$ of the vector (each set of bits) corresponds to the set of 1 bits in the $j$th bit position of the $E_i$ values of all instrumented gates. The Wallace tree achieves its O(log N) performance by reducing the number of bits in each layer by a constant factor of 3/2. This is done by feeding sets of 3 bits in each layer into full adders, reducing them to 2 bits in the next layer, with the sum bit at the same bit position as the inputs and the carry bit at the next bit position up. This continues until the last layer, where there are only 2 or fewer entries left in every bit position, at which point an ordinary addition is performed.

The summation tree is pipelined by placing D flip-flops at fixed intervals along the tree,

limiting path lengths and thereby limiting potential impact on achievable clock rates. The

right-hand side of Figure 10 illustrates the regular placement of D flip-flops between layers

of adders.

Static RAM Static RAM arrays frequently appear in digital circuits, even ones that are not memory-oriented. Of the benchmark netlists, for example, all of the processors contain

SRAMs. These are used as register files in harmonica2, a GPU-like core, as BTB tables and the register file in the 5-stage RISC-like Iqyax, both discussed in Chapter 4. They are also used as a small scratchpad memory in the tiny microcontroller called t-core in this evaluation. These are modeled similarly to gates, except the exclusive-or based activity detectors are replaced with circuitry to detect reads and writes. Read and write energies for each type of RAM present in the system must be provided. These can be generated through the use of SRAM models like Cacti [55], which was used in this evaluation.

Other Sources This technique covers the potential sources of dynamic energy dissipation fairly well, and static energy dissipation can be analyzed offline, but there are other potential sources of power dissipation that must be considered. In addition to the cost of toggling gate outputs and the internal gate energy dissipation already covered, energy dissipated overcoming wire capacitance is quite important. While this is not accounted for in this analysis, a parameter extraction technique capable of producing accurate estimates of wire capacitance could certainly be used in place of, or in addition to, the model currently in place: the power dissipation on toggle could simply be increased by $\frac{1}{2}CV_{dd}^2$ for each instrumented gate. Sources of power dissipation that are not covered by this model, or any activity-based model in general, include power dissipation caused by uneven path lengths: glitch power. This is typically minimized in the technology mapping and optimization phases by path length optimizations, but is still expected to be present and is a source of error in all activity-based energy models.

Theoretical Overhead Because the Wallace tree has O(N) space efficiency, each instrumented gate has a fixed-size activity detector, and the fraction of gates that must be instrumented for a given accuracy can be assumed to be roughly constant, the overhead in gate count, and therefore, roughly, FPGA area, is O(N). Equivalently, the FPGA area requirement of an instrumented model is the FPGA area requirement of an uninstrumented model times a constant factor.

Figure 11: Gate count overhead and error for levels of instrumentation from one gate in ten thousand to all gates.

Accuracy vs. Overhead The power estimation error experienced in simulations using these static samples is roughly normally distributed, with a mean near 0 and a variance that decreases with increasing amounts of instrumentation, measured by the fraction of gates that are instrumented. For each benchmark, a sweep of the fraction of gates instrumented was performed, trying 11 different samples of each size in 41 steps spaced evenly along a logarithmic scale from 1 in ten-thousand of the gates present to all of the gates being instrumented. These were evaluated for accuracy with a brief, 10k-cycle simulation, and compared to the fully-instrumented example to quantify error.

The benchmarks used were:

• t-core– a minimalistic 16-bit accumulator architecture microcontroller intended for

tightly-embedded applications.

• Iqyax– a single-issue pipelined RISC core, including branch prediction.

• harmonica2– [30] a GPGPU-like single-instruction, multiple-thread core, 4 lanes by 16 warps.

• sort– even-odd merge sort, 32 elements by 32 bits.

• fft– fully combinational fast Fourier transform, fixed point, 16 elements by 12 bits

per element.

• qam– quadrature-amplitude modulator, DDS with 8-bit precision, 64-point constellation.

The results of this sweep are shown in Figure 11. Error is reported as a distribution, with both the minimum and maximum sample value reported as well as the values three standard deviations from the mean on both the upper and lower side. Since the results for static sampling fall on a normal distribution, these can be seen as the bounds of a 99.7% confidence interval on the range of the results.

Instrumenting 10% of gates, with an average expected error of less than 13%, the gate count overhead is approximately 2 times, while instrumenting all gates leads to no error and an overhead of 11 times. The overhead is estimated as number of gates in the output compared to the number of gates in the input. Given the nature of the power model implemented, this tends to under-estimate the overhead, but the trends remain accurate.

3.7 Conclusions

In this chapter, the CHDL core library and template library were described in detail, showing that it is possible, simply by creating a library to perform simulation and manipulation of gate-level logic circuits, to support domain-specific languages capable of describing hardware at the register transfer level. Extensions to the functionality of CHDL can be implemented either as libraries or as part of user programs themselves, enabling the implementation of highly-specific optimizations which may generate and even alter the in-memory netlist. In the next chapter, a specific accelerator architecture that has been implemented with CHDL is discussed, and the chapters following it continue to add layers of abstraction on top of

CHDL.

CHAPTER IV

THE HARMONICA DATA PARALLEL CORE

4.1 Iqyax RISC Core

The CHDL hardware design environment described in the previous chapter was designed to enable the development of accelerators including instruction set processors. An early example of this is the Iqyax RISC core, which strives for compatibility with MIPS 1, but also provides the load linked/store conditional instructions from the MIPS 2 instruction set, enabling efficient implementation of synchronization primitives in multi-core designs.

Iqyax provides optional support for a register scoreboard, MSHRs to enable support for non-blocking caches, and a branch target buffer to enable branch prediction.

Iqyax was created to provide a textbook single-issue pipelined core that could run multithreaded benchmark programs compiled by GCC on the CHDL component for the Structural Simulation Toolkit (SST), which is further explained in Section 3.6.1.3. Both Iqyax and the core described in the remainder of this chapter, Harmonica, use the memory request and response interface formats provided by the CHDL template library, which is translated by the CHDL SST component into a format compatible with SST memory system components. While the Iqyax RISC core provides configuration of performance/area trade-offs including support for non-blocking memory accesses, it does not provide the kind of high-throughput data parallel execution common in modern accelerator cores. This would require a fundamental change of architecture. The remainder of this chapter describes a family of such architectures as well as an implementation.

4.2 Introduction

Recently, there has been a re-emergence of research on moving processing to memory to reduce data movement energy, mitigate the impact of poor locality, and increase the memory bandwidth available to processors. While research in the 1990s addressed the idea of placing processing in memory (PIM), the continued evolution of Moore's Law and related architectural advances and market forces of the day precluded the need for such architectures. Further, the integration of logic in a DRAM process presented its own challenges.

However, with the slowing down of Moore's Law, the emergence of 3D packaging provides a vehicle for integrating logic-optimized and DRAM-optimized dice, thereby providing a means to sustain performance scaling by pushing back the memory bandwidth and power walls. Placement of compute accelerators in memory, however, is subject to distinct constraints and tradeoffs between power, performance, and area. Our goal is to design a SIMT architecture that can be tuned to match different 3D memory organizations.

In this chapter we define a family of lightweight single instruction multiple thread

(SIMT) architectures referred to as the Heterogeneous Architecture Research Prototype

(HARP). The HARP infrastructure defines a family of instruction set architectures parameterized by aspects such as the word size, the number of general-purpose and predicate registers, and the size and number of synchronously executed thread blocks. Their simplified design realizes high bandwidth, latency tolerance, low area, and low power. The execution model is a generalization of the bulk synchronous parallel (BSP) model supported by commodity general purpose GPUs and languages such as OpenCL and CUDA. Our refinements to the BSP model add support for dynamic parallelism and exceptions. An implementation of the HARP instruction sets named Harmonica has been produced using an open-source

C++-based hardware design library called CHDL.

CHDL is used to produce both a simulation model and a gate-level netlist, using a 15nm standard cell library, for the Harmonica core design, which is configured in terms of these parameters. A system deploying one of these architectures in combination with a stacked

DRAM could size the cores so they fit in a tiled multicore arrangement along with per-core caches and independent DRAM channels. In our analysis we run a set of analytics and data warehousing oriented benchmarks. A companion toolchain provides an assembler, linker, emulator and gate and cycle level simulators also using the same set of parameters. We use this environment to generate, explore, and evaluate SIMT accelerators for integration with

3D DRAM memories.

4.3 Background and Overview

By permitting some operations to be performed on data in-package, (1) the amount of data movement between the processor chip and the memory can be reduced, (2) energy efficiency improved, (3) use of pin bandwidth optimized, and (4) memory access latency substantially reduced, thereby mitigating the performance effects of poor reference locality. The past impediments to combining DRAM and logic in a single process are now removed by the emergence of 3D stacked DRAM, permitting logic-optimized dice and DRAM-optimized dice to be integrated within a single package, interconnected by through-silicon vias (TSVs).

This may now be considered a mature technology. Manufacturers of NAND flash memory have been using 3D die stacking for years, and several 3D stacked DRAM technologies are now commercially available. Wide I/O-2 is designed for mobile platforms by stacking conventional DRAM on embedded processors (3D interface), while High Bandwidth Memory (HBM), which incorporates a stack of 4 DRAM dice and a single logic die, is designed for high performance processors [54]. In our work we utilize a third class of 3D memories based on the Hybrid Memory Cube from Micron [28], which is organized into many vertical channels called vaults.

4.3.1 Hybrid Memory Cube (HMC)

The Hybrid Memory Cube standard [15] defines an interface and a device architecture for

3D-stacked memory. It is composed of multiple stacked DRAM dice and a single base logic die interconnected with through silicon vias (TSVs). Each DRAM die is divided into partitions in a 2D grid and the corresponding partitions aligned vertically and connected with TSVs form a vertical DRAM channel called a vault. Each vault has an independent

DRAM controller on the logic die; therefore each cycle, an access may be initiated in a ready bank in each vault.

These vault controllers are interconnected by an on-chip network consisting of two levels of crossbar switches to high speed external serial links. A typical HMC 2 device provides

8GB of memory organized as 32 vaults, with 4 layers and 4 banks per layer in each vault,

or 512 total banks. The logic and memory dice can be fabricated in different process technologies; for example, the DRAM dice could be fabricated in 50nm while the logic die is fabricated in 28nm [15].

Figure 12: Hypothetical floor-plan of a current-generation HMC logic layer compared to an accelerated vault. At least 1.5mm² per vault is expected to be made available for accelerators by moving from a 28nm to a 15nm process.

Compared to HBM or Wide I/O-2, the HMC architecture provides highly parallel access to the memory optimized for random accesses; a large number of vaults with one channel per vault and multiple banks per vault, leading to an unprecedented level of parallelism in the memory system. These devices were designed to provide very high bandwidth for random accesses by exploiting this memory-level parallelism and the density of TSVs.

4.3.2 Architectural Constraints

In a system employing per-vault accelerators, as shown in Figure 12, the accelerator footprint must fit within the vault footprint, including the sizes of the vault controller and router for the on-chip network. If we assume that with the current 28nm process technology the entire 68mm² HMC die area is consumed by the vault memory controller and network interface, then we can estimate that by shrinking to a more modern 15nm process, 1.5mm² per vault becomes available for accelerators. Additionally, the proximity of compute and DRAM places tight constraints on the thermal design power (TDP) of the logic die, estimated in some studies at 10W [36]. The near-memory accelerators must also be able to effectively utilize the available memory bandwidth, especially for those applications plagued by poor spatial and temporal locality.

Figure 13: Simplified diagram of an SIMT pipeline, showing active and inactive threads $t_{i,j}$ grouped into warps and lanes.

4.4 The HARP Architectures

To conform to the power constraints, area constraints, and memory bandwidth opportunity within a vault, we selected a single instruction multiple thread (SIMT) architecture, parameterized to fit within the constrained environment of the vault-level accelerator in the logic layer. The parameters that can be adjusted include the size of the general-purpose and predicate register files, the width of the parallel execution units (number of threads per warp), and the number of thread blocks or warps. In the following sections we describe a family of such compute architectures, the toolchain for generating architectural instances, and unique features that were adopted to meet the characteristics of emerging applications.

Due to the similarities in their SIMT designs, the terminology used in this section is the same terminology used by NVIDIA when discussing their line of SIMT GPGPUs.

The parameterized family of instruction set architectures (ISAs) we have developed originated in a project to create a Heterogeneous Architecture Research Prototype, and is therefore referred to as the family of HARP instruction sets. The HARP ISAs are SIMT architectures that can be described by parameters for the word size, instruction encoding, the number of general-purpose and predicate registers, and the two-dimensional thread count: the number of warps and lanes, or threads per warp. Figure 13 illustrates the way in which warps and lanes are grouped logically in a simplified pipelined HARP implementation.

The Heterogeneous Architecture Research Prototype (HARP) defines a space of instruction set architectures that implement a single instruction multiple thread (SIMT) programming and execution model. HARP is composed of two main elements. The first is a parameterized family of instruction set architectures (ISAs) for the SIMT model. An instance of a HARP ISA can be described by parameters for (1) the word size, which must be a multiple of eight bits, (2) the instruction encoding, (3) the number of general-purpose and predicate registers, (4) the size of a warp, which is a synchronously executed block of threads and therefore determines the width of the parallel execution units, and (5) the number of warps supported by an architecture implementation.

4.4.1 Execution Model

A simplified HARP core pipeline is illustrated in Figure 13. The ISA enables the execution of multithreaded kernels where each kernel is composed of a set of synchronously executed blocks of threads referred to as warps. Each warp is composed of a number of threads that execute in lock-step, where the number of threads in a warp is equal to the number of lanes in a core. In Figure 13, threads are given numbers $t_{i,j}$ where $i$ is the warp ID and $j$ is the position of the thread within the warp. Since kernels executing on SIMT cores must

span multiple warps in order to achieve high performance and latency tolerance, a hardware

barrier mechanism is provided to synchronize between these warps. The availability of a

large number of warps provides for effective latency hiding, while the concurrent execution

of threads within warps can generate significant memory traffic that can be satisfied with

the high internal memory bandwidth and concurrency within a vault.

4.4.2 Parameterization

HARP is distinguished by its parameterizability, and implementations of HARP can take

on a wide range of performance and energy characteristics, as seen in the evaluation of our

implementation called Harmonica in Section 4.7.

A succinct string representation is used to identify HARP architectures. A 4w32/32/8/4 architecture, for example, has four-byte (32-bit) machine words, word-oriented instruction encodings, 32 general-purpose and predicate registers, eight lanes per warp, and supports four warps, for a total of 32 hardware threads. Valid HARP architectures have either word-encoded (w) or byte-encoded (b) instructions and between 1 and 256 predicate registers, general-purpose registers, warps, and lanes. This format is used by the open-source utility Harptool, a monolithic assembler, linker, and emulator utility written in C++, to specify HARP architectures.

Table 7: Selected instructions from the HARP instruction sets.

  Instruction           Operands                 Description
  clone %rD             %rD: dest. lane          Copy register state to destination thread within warp.
  jalis %rL, %rN, DEST  %rL: link register       Jump-and-link-and-spawn; call multithreaded function.
                        %rN: number of lanes
                        DEST: dest. label
  jmprt %rD             %rD: dest. address       Jump-and-terminate; return from multithreaded function.
  @pX ? jmpi DEST       @pX: predicate register  Branches are implemented as predicated jumps.
                        DEST: dest. label
  split @pX             @pX: predicate register  Handle divergent control flow: diverge.
  join                  –                        Handle divergent control flow: reconverge.
  bar %rID, %rN         %rID: barrier ID         Synchronize warps with barrier.
                        %rN: warp count
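Parsing such a string is straightforward; the following is an illustrative sketch, not Harptool's actual parser:

    #include <cstdio>
    struct HarpArch { unsigned wbytes, regs, preds, lanes, warps; char enc; };
    bool ParseArch(const char *s, HarpArch &a) {
      // e.g. "4w32/32/8/4": word bytes, encoding, registers, predicate
      // registers, lanes per warp, warps
      return std::sscanf(s, "%u%c%u/%u/%u/%u", &a.wbytes, &a.enc, &a.regs,
                         &a.preds, &a.lanes, &a.warps) == 6;
    }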

4.4.3 Instruction Set Features

In addition to its parameterization, the HARP instruction set provides several unique features in order to implement SIMT semantics and efficiently support application behaviors. These are (1) full predication, (2) split and join instructions for handling divergent control flow within warps, (3) a barrier instruction and dedicated functional unit, and (4) warp-level exceptions. Some of the more interesting and relevant instructions are listed in Table 7.

Predication SIMT cores keep threads within a warp at the same program counter in

order to reduce pipeline control overhead. Throughput is maximized by keeping control

flow paths in all threads within a warp the same. When the control flow of threads within a warp diverges, such as when a branch in one lane's thread is taken while a branch in another lane's thread is not taken, a subset of the threads are masked out and not executed; these are shown as the white squares in Figure 13. Two mechanisms are used to serialize the instruction stream between divergent threads. The first, predication, simply uses the value of a 1-bit register to determine whether an instruction will execute. This way, simple if/then/else types of statements can be translated to simple assembly code. In HARP, for example, the C-like pseudocode "if (%r1) %r2++ else %r2--;" becomes:

rtop @p0, %r1 // reg. to predicate

@p0 ? addi %r2, %r2, #1 // if @p0, add

notp @p0, @p0 // complement @p0

@p0 ? subi %r2, %r2, #1 // if @p0, subtract

A stack-based algorithm, discussed in the next section, is used in cases where it is not possible to use predication, e.g. due to looping behavior, or cases where it is not desirable to use predication for performance reasons. All predicated instructions encountered incur the full instruction execution penalty in our implementation, even when the predicated instruction is not executed by any lane.

Figure 14: Split and join instructions are used to manage control flow divergence.

Split/Join The immediate post dominator algorithm (IPDOM) first described by NVIDIA

in [39] is a straightforward way to handle the serialization of control flow in cases where predication is not appropriate. IPDOM works by maintaining a per-warp mask of running threads. When a divergent branch occurs, IPDOM pushes a copy of the current thread mask to a special hardware stack, as well as a mask for the not-taken direction of the branch, then sets the mask to contain only the threads that took the branch. At a compiler-selected reconvergence point, which for true IPDOM is the first instruction of the immediate post-dominator of the taken and not-taken paths, the next entry is popped off of the stack and the mask and PC set accordingly.

In the HARP instruction sets, selection of divergence and convergence points is performed explicitly with a pair of instructions. The split instruction indicates the point of (possible) divergence, and the join instruction indicates the reconvergence point. The example from the previous section, "if (%r1) %r2++ else %r2--;", is handled as follows when a divergent branch is used instead of predication:

rtop @p0, %r1

@p0 ? split

@p0 ? jmp then

subi %r2, %r2, #1

jmp cont

then: addi %r2, %r2, #1

cont: join

Barrier One of the principles of the bulk synchronous parallel (BSP) execution model used with SIMT processors is that synchronization using barriers is inexpensive. HARP provides inter-warp barrier synchronization through a single instruction. Since threads within warps execute in lock-step, barrier synchronization at every point within a warp is implicit. The barrier instruction, bar, takes as operands the number of warps participating in the barrier and the barrier ID. Any number of barrier table entries may be provided by the implementation, allowing for multiple simultaneous barriers to be active. No warp that

has executed a bar will execute until the number of warps provided in the count operand have also executed bar instructions with the same barrier ID. The way this is implemented in hardware is implementation specific, but in the implementation discussed in the next section, it involves table structures containing warp information in a special functional unit.

4.4.4 Kernel Launch Model

HARP is designed to run independently, minimizing the need for host processor interaction, and as such the HARP instruction sets include support for spawning new warps and spawning threads within warps. For the benchmarks used in our evaluation, the entire launch and run of the kernel is performed by a single Harmonica core on data that is already present in memory, without any interaction with a host processor. Because of these features, HARP can be said to offer a generalization of the BSP execution model and to provide instruction set level hooks for dynamically spawning threads.

When execution on a HARP processor begins, a single warp is running. Executing the wspawn instruction spawns a single warp and copies a given register value into a register of the new warp. This can be used to pass in a pointer to warp parameters and bootstrap the warp spawning process. Within warps, a clone instruction is provided to copy register state, and jump-and-link-and-spawn and jump-register-and-terminate instructions are provided to initiate and terminate multi-lane execution, respectively. This allows the parallel portion of execution within a warp to be treated as a simple procedure call.

The wspawn, clone, and thread control instructions are not privileged instructions and can be executed by user code, enabling implementations to support dynamic parallelism as described in [29]. These features are made available to the programmer through the HARP runtime library, a lightweight set of functions exposing the ability to spawn warps and call multi-lane functions. The purpose of the HARP runtime library is to provide a C API encapsulating the HARP instruction set's thread management features, including control flow divergence and the spawning of threads.

Figure 15: The Harmonica pipeline, with stages Sched(), Fetch(), PredRegs(), GPRegs(), and Exec().

4.4.5 Exception Support

Rare among SIMT architectures and accelerator-oriented architectures is HARP’s ability to handle exceptions through a simple hardware-provided mechanism. Exception handlers in HARP architectures run only in lane 0, and are called in response to events such as page faults and external hardware interrupts. These are implemented in our Harptool emulator but are not supported by the current version of our implementation discussed in the next section, nor are they evaluated in Section 4.7.

4.5 The Harmonica Microarchitecture

Harmonica is an implementation of the HARP family of SIMT architectures and ISAs, implemented using CHDL, a hardware description library for C++, designed to enable the creation of hardware using highly templated generators. Harmonica is a small low-power design intended to meet the unique challenges of near-memory computing; a single-issue implementation of a subset of the HARP parameter space. Using the notation described in

Section 4.4.2, XwN/N/W/L architectures are supported by Harmonica, requiring the use of word-oriented encoding, and that the number of general-purpose and predicate registers be the same. This core design achieves the requirements enumerated in Section 4.3.2 through a variety of techniques:

• Small area and low power through a simple, customizable design that may, when

appropriate, eschew energy-and-area demanding features such as floating point. Our

analysis focuses on fixed point Harmonica cores.

• Tolerance to memory latency through the exploitation of thread-level parallelism. A

surplus of warps is maintained that can be executed while other warps await the

completion of memory operations.

• The highest possible utilization of memory bandwidth through the use of a multi-lane

SIMT design. Each cycle, multiple memory requests can be made simultaneously.

Note that parameterization and tool chain makes it possible to generate a single lane

Harmonica implementation.

4.5.1 Pipeline Detail

The design of Harmonica is structured as a modular pipeline, illustrated in Figure 15, in which warps flow through like packets in a network. The semantics of the architecture allow these warps to be completely independent, executing a single instruction on each active lane in each trip through the pipeline.

Scheduler The scheduler is a simple queue of warp states. Since this is a single issue design, a single warp per cycle proceeds from the scheduler to the fetch stage. Warps are scheduled in a FIFO manner, leading to a simple circular buffer design for the warp table, which dominates the area for this stage but is relatively small given typical numbers of threads per warp.

Instruction Fetch and Register Access Harmonica uses the CHDL memory interface for both the link to the data cache and the instruction cache. This is an arbitrary-width request/response interface with arbitrary word size, which matches requests and responses using tags. Support for atomic operations is provided by the memory interface format through load-linked/store-conditional operations. The memory interface descriptions are provided as part of the CHDL template library, making them the standard memory interface used by CHDL designs.

In Harmonica, the warp ID is used as the tag value for both loads and instruction fetches, and the warp state is stored in a table while the operation completes. Warps leave the fetch

stage in the order in which their responses return from the instruction cache, and continue to the two register read pipeline stages. Like the scheduler stage, this stage is dominated in area and gate count by the warp table structure.

Figure 16: The Harmonica register file design is somewhat simplified compared to that of commercially-available SIMT GPUs.

Register Files The use of a single-issue design enables a simplification of the register

file, shown in Figure 16, compared to high-performance GPU designs. Due to their highly-parallel design, typical GPUs require multi-ported register files in order to achieve design performance. Because of the large area penalty of multi-ported memory structures, these are implemented using banking schemes and a large crossbar network.

Because the Harmonica design issues only a single warp at a time, its register file design can be greatly simplified, using an SRAM structure with a single read port and a single write port, but foregoing both the bank steering logic and the operand crossbar, saving area and energy at the cost of throughput.

Execute Once the warps have gathered their operands from the register stages, execution can be performed. Harmonica dispatches warps to functional units by opcode with a simple router. Warps which have finished executing their instructions are returned to the schedule stage, with their program counters and thread masks updated as required by the instruction executed. Concurrently with this, their results are written back to the register file.

Because warps share no state, execution units are independent, with an interface consisting entirely of an input and output with a single ready bit for backpressure: a FIFO interface. This simplifies the implementation and inclusion of functional units and, as a result, enables customization of Harmonica SMs. For example, all functional units are optional, and some, such as the single-cycle multiply and divide, can be replaced with lower cost bit-serial multi-cycle versions to trade performance, and even, in the case of floating point units, accuracy, for size and power.

Among the functional units present in the execute stage is a simple load/store unit that, much like the fetch unit, sends a request using the CHDL memory request/response interface for each operation, and stores the warp state in a table, with the requirement that there is a memory interface per lane and for loads, all lanes’ responses must return before the warp is allowed to advance. This is intended to interface with a small, multi-ported cache with one logical port per SIMT lane.

4.6 Tool Flow

A set of tools and benchmarks has been developed around the HARP instruction set architectures and the Harmonica implementation, providing a complete flow that allows benchmarks to be compiled to machine code representations and the Harmonica core itself to be elaborated into synthesizable Verilog netlists and simulated or run on an FPGA. Figure 17 provides an abbreviated view of this tool flow, including a 2-stage compilation flow for C code allowing it to run on HARP cores, the harptool instruction set multitool providing an assembler, linker, and emulator, and a variety of ways to execute applications on simulated or synthesized

Harmonica hardware. At the start of both of these flows is a HARP configuration string, as

described in Section 4.4.2. This provides the parameters needed to select the appropriate register file dimensions, instruction set representation, and pre-defined constant values in the runtime library.

Figure 17: Complete software and hardware flows, including examples of evaluation by CHDL simulation, Verilog simulation, and execution in FPGA.

4.6.1 Software Flow

The HARP software flow is built around a lightweight runtime library, the LLVM compiler framework, a translation utility called Harptrans, and the HARP architecture assembler, linker, and emulator, Harptool. An LLVM-based compiler back-end has also been developed [26], but because it was not compatible with the full suite of benchmark applications across all possible core parameters, it was not used in the evaluation of the Harmonica core, to avoid obscuring trends in the results.

• Simple runtime library

While Harmonica and the HARP architectures were created with accelerator architectures in mind, it is not the purpose of the evaluation flow to solve such problems as the spawning of kernels on accelerator cores or communication between main and accelerator cores. Instead, the focus of this tool flow is simply running programs on the Harmonica core itself, providing mechanisms for events such as control flow divergence. The runtime library therefore assumes that memory has already been loaded with an executable image and any necessary data, and it provides enough bootstrap code to begin execution at address zero. At program entry, there is only a single thread running in a single warp.

Table 8: Some of the calls in the HARP runtime library.

Function       Description
call_par()     Multi-lane function call.
spawn_warp()   Spawn a warp.
split()        Split instruction. Start divergent region.
join()         Join instruction. End divergent region.
barrier()      Barrier instruction.
lane_or()      Lane-reduce logical or.
lane_and()     Lane-reduce logical and.
lane_min()     Lane-reduce min.
lane_max()     Lane-reduce max.

Some of the functions in the runtime library are listed in Table 8. Many of them simply provide access to the parallelism-oriented aspects of the Harmonica instruction set, such as the barrier, split, and join instructions. Others, such as lane_or() and lane_max(), perform parallel operations involving inter-lane communication that would be difficult to express directly in a high-level language. These runtime calls highlight an aspect of the programming interface presented for HARP: the split() and join() operations are explicit. There is no automatic insertion of these instructions, nor is such insertion supported by the LLVM-to-HARP translator Harptrans, so low-level control flow decisions that are usually automated must be made by the programmer.
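To make this concrete, the outline of a kernel with an explicitly-managed divergent region is sketched below. This is an illustrative sketch only: the call names come from Table 8, but their argument conventions and the surrounding kernel structure are assumptions, not code from the actual benchmark suite.

/* Hypothetical sketch of explicit divergence management with the
 * HARP runtime calls of Table 8. Only the function names are taken
 * from the table; the convention of passing the branch predicate to
 * split() is an assumption. */
void scale_odd_entries(int *data, int lane)
{
  int odd = data[lane] & 1;
  split(odd);       /* begin divergent region: mask off lanes where
                       the predicate is false */
  data[lane] *= 3;  /* executed only by the active lanes */
  join();           /* end divergent region: restore the full mask */
  barrier();        /* synchronize all warps before continuing */
}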

• LLVM-to-HARP translator

Harptrans provides rudimentary support for converting LLVM intermediate representation to HARP instructions, including register allocation. This includes a greedy linear sweep algorithm for register allocation and a basic instruction selection scheme. The table-based generator employed by LLVM is more powerful and allows a wider range of peephole optimizations; future work will deprecate Harptrans in favor of the compiler infrastructure described in [26].

• Harptool assembler, linker, and emulator

A single source code base provides all of the fundamental instruction-set-adjacent software related to HARP, including an assembler, linker, and emulator. This program, called Harptool, acts as a reference implementation of the instruction set, providing a valuable debugging tool both for implementations of the instruction set, e.g. Harmonica, and for software targeting the instruction set, e.g. Harptrans and the HARP LLVM back-end.

The harptool emulator provides a fast execution environment for Harmonica-compatible binaries and has been an invaluable tool in the evaluation of the Harmonica core design.

4.6.2 Hardware Flow

The hardware flow is centered around the CHDL core library and template library. The C++ source code for Harmonica is compiled with a set of parameters encoding the configuration string, including the register count, the number of threads per warp, and the maximum number of supported warps. This is translated at compile time into an executable generator that produces a synthesizable Verilog netlist, which may be simulated, executed on an FPGA, or technology mapped to hardware. The area and energy estimates later in this chapter were produced by analyzing the output of CHDL's built-in technology mapping algorithm.

4.7 Performance Analysis

We have developed a set of open source tools for exploring the HARP design space. Figure 18 shows that the tool flow is bifurcated into two paths: along the bottom is the software path used to compile benchmarks, and along the top is the hardware toolchain used to produce and evaluate Harmonica cores at the gate level. The software flow includes the benchmarks themselves, their shared runtime library, an LLVM-based compilation infrastructure, and the aforementioned Harptool, which functions both as an assembler/linker and as the emulation front-end for our cycle-level simulator.

Figure 18: Simplified diagram of the tool flow used to evaluate the Harmonica design. (Top, hardware toolchain: elaborate a gate-level model for power analysis and simulation; bottom, application toolchain: compile, assemble/link, and run in cycle-level simulation, keyed by an architecture ID.)

The details of the subset of the hardware simulation flow used to produce each category of results are presented along with those results. Generally, a cycle-accurate execute-at-execute Harmonica simulator built around the emulator provided in Harptool was used to produce results quickly. For detailed modeling of power consumption, workloads long enough to reach the steady state of the first application phase were run on a gate-level simulation of the Harmonica design, synthesized for a 15nm standard cell library and instrumented with a counter tree for energy consumption estimation.

Our analysis is performed in the context of a single accelerated vault, considering kernels that operate on data contained entirely within the local vault. While applications operating on larger datasets may experience slowdown due to contention in the on-chip network, there will always be less traffic than the same application would generate running on an off-chip processor. By focusing on the single-vault case, we provide an evaluation of the Harmonica core design disentangled from the orthogonal problems of network-on-chip design and execution models for many-core SIMT processors.

4.7.1 Benchmarks

Workloads were chosen with an emphasis on memory-bound analytics and applications that must traverse large in-memory data structures. It was felt both that these kinds of workloads will become increasingly important as the sizes of data sets grow and that these are the applications for which in-memory processing provides the most benefit. The workloads themselves, listed in Table 9, are kernels representing common functions that could reasonably be expected to be offloaded by an application to the memory system. All are fixed-point benchmarks. A floating point unit is supported as an optional add-on for Harmonica cores but is beyond the scope of this work, in which the workloads do not benefit from fast floating point computation.

Table 9: The benchmarks used in our evaluation of the Harmonica core design. Cited papers are sources for the algorithms used.

Name       Description                 Data                        Size
bfs        Breadth-first search. [38]  Pennsylvania road network.  1090920 nodes, 3083796 edges
radix      Radix sort. [50]            Random 32-bit integers.     1048576 elements
binsearch  Binary search.              Random 32-bit integers.     1048576 elements, 1048576 lookups
hashtable  Hash table lookup.          Random 32-bit integers.     1048576 elements, 1048576 lookups
vecsum     Sum integer vector.         Random 32-bit integers.     16777216 elements
select     Select from table.          Random Boolean values.      1048576 elements, 1037940 matching rows

4.7.2 Results

The following represents an analysis of the execution of the benchmarks on a Harmonica core that supports integer operations.

Area A gate-level representation of a range of Harmonica designs has been produced using the technology mapping algorithm provided by CHDL, a collection of gates from the Open Cell Library for the FreePDK15 15nm PDK [37], and a set of descriptions of SRAM banks. SRAM area was estimated at bank granularity using a conservative projection of SRAM array area based on values provided by Cacti [55] for the 32nm node, and logic area was estimated as $\left(\sum_{g \in \mathrm{Gates}} A_g\right) / 0.90$, the area required for a placement solution with 90% area utilization.

We study a wide range of Harmonica implementations. As seen in Figure 19, many implementations occupy less than a square millimeter: small enough to fit, along with a 1kB instruction cache and a 32kB data cache, in the empty space freed up by shrinking the existing Hybrid Memory Cube logic layer design from 28nm to 15nm. Note that this assessment does not assume any available area in the current HMC logic layer design, a conservative assumption: since the HMC vault pitch is likely determined by the DRAM banks, the current generation may already have free area in the logic layer. There is likely area to accommodate multiple Harmonica cores, or larger Harmonica cores, in a vault.

Figure 19: Area estimates for Harmonica cores, not including L1 caches, based on synthesis. (Four panels, for 8-, 16-, 32-, and 64-warp configurations, plot area in mm^2, divided into logic and static RAM, for 1-32 lanes and 8-32 registers.)

Figure 20: Average power for 1ms of each application running at 650MHz. (Peak power in mW over a 6500-cycle window, divided into logic, interconnect, and SRAM, for each of the six benchmarks across architectures from 4w16/16/4/8 to 4w32/32/16/16.)

Power As a complement to CHDL's technology mapping functionality, a power emulation library has been built that allows activity-based energy models to be generated as CHDL hardware. Using the extracted capacitances and internal switching energies provided along with the Open Cell Library, a set of netlists augmented with energy models was created that could be run through a gate-level simulator. These were simulated at the gate level for long enough to reach steady state in our set of applications. This produced a set of logic power traces in terms of energy per cycle, along with SRAM access counts. The access counts were multiplied by Cacti-derived access energy estimates and added to the logic power to produce the final results, which are expressed in milliwatts assuming a 650MHz clock rate, which is, according to [15], the rate at which at least some logic in the HMC operates.

Figure 21: Relative application performance in simulation as a function of memory access latency for 4w32/32/16 architectures with 16 or 32 warps. 32 cycles is highlighted as it represents the most realistic latency estimate for local vault accesses.

Figure 20 shows the breakdown of power consumption for 1ms of operation at steady state for each of our six applications. Even the largest cores simulated, 16 lanes wide with 16 warps, operate well within the 300mW-per-core limit (one core for each of 32 vaults, approximately 10 watts total power). Notably, one of the applications, breadth-first search, consistently expends less power than the remaining five. This is because the irregular nature of breadth-first search leads to higher-than-average control flow divergence and warp load imbalance, limiting the average number of active SIMT lanes and pipeline stages.

Latency Tolerance To measure latency tolerance, we used a cycle-accurate simulator built around the Harptool emulation library. This includes a highly-banked 32kB 4-way set associative data cache and a 32kB 4-way set associative instruction cache. For these results, no attempt was made to optimize the data cache address mapping to improve hit rates, and hit rates were typically quite low in the D-cache. Meanwhile, all kernels fit entirely into the I-cache.

Figure 22: Bandwidth as a function of application and Harmonica core size. The number of warps and the number of lanes are the same in this case, and the number of GPRs is held at 32.

The DRAM timing model used to produce the latency tolerance results, summarized in Figure 21, is a simple fixed-latency model, since the purpose is to observe limits on application performance caused only by latency, without consideration for bandwidth or the address dependence of latency. Bandwidth is given a complete treatment later in this section, and the impact of vault bank conflicts and caching is examined in Section 4.7.3.

The cores studied all had 32 general-purpose registers per thread and 32 SIMT execution lanes, though these values had little impact on relative performance.

Latency tolerance in SIMT cores in general, and in Harmonica in particular, is achieved by having a large number of available warps that can be scheduled, so that a large number of memory operations can be in flight simultaneously, taking advantage of high-bandwidth, high-latency memory like the DRAM used in our HMC-based vaults. The number of available warps is the most obvious factor affecting a Harmonica core's tolerance to memory latency, and as Figure 21 shows, increasing the number of warps effectively increases that tolerance. For the applications studied, even the 16-warp Harmonica core is capable of handling the estimated 32-cycle latency of the HMC vault DRAM with negligible performance degradation, and a 32-warp version is capable of handling four times that latency.

Bandwidth Demand Figure 22 shows the bandwidth achieved by the Harmonica core for various design sizes on our applications. For this simulation, in order to evaluate peak bandwidth demand, the latency of a DRAM access was fixed at 32 cycles and the bandwidth was left unbounded, with no bank or channel model in place. In this simulation, the Harmonica core with 32 lanes and 32 warps running at 650MHz achieved quite high bandwidth utilization: greater than two thirds of the DRAM vault's available 10GB/s on 4 of the 6 applications. The remaining two, hashtable and binsearch, despite their relatively random access patterns to their data stores, use quite a few thread-local variables and thus achieve surprisingly high cache hit rates, leading to lower-than-expected bandwidth utilization.

Figure 23: Plot of miss ratio vs. slowdown for various data cache associativities and capacities from 64 bytes to 16 kilobytes, simulated with a realistic vault model.

4.7.3 Impact of Cache

Given the small size of the benchmark kernel code, less than 5kB on average and 10kB maximum (for breadth-first search), it is unsurprising that the only instruction cache misses in our evaluation are compulsory misses; the kernels all fit in our 32kB I-cache. While more complex applications may necessitate future exploration of instruction cache designs for SIMT cores, the looping nature of most programs, together with the fact that each thread in a low-divergence application performs only $1/W$ instruction cache accesses per instruction, where $W$ is the number of threads per warp, de-emphasizes instruction caches in SIMT accelerator design.

Despite the cores' tolerance to memory latency, the performance of Harmonica cores is dependent on the presence and performance of the data cache. This effect is entirely due to the cache's ability to reduce the DRAM queuing latency caused by bank conflicts. Figure 23 shows, for a 32-GPR, 16-lane, 32-warp Harmonica core, the impact of cache miss rate on application performance. The cache sweep was performed using an accurate vault performance model, thereby emphasizing the effects of bank conflicts and TSV bandwidth limits.

4.8 Context

The use of multithreading for memory latency tolerance and simplified datapath control can be traced back to pioneering "barrel processors" such as the Tera MTA. This technique was notably adopted in commercial SIMT architectures, first described in documentation from NVIDIA, alongside a thread-based vector processing approach [39].

An interest in processing-in-memory in the 1990s led to work including [23] and [41], which ultimately never led to the production of a PIM product. Developments in 3D die stacking have led to renewed interest, and near-memory accelerators have been proposed for a diverse range of applications, including general-purpose computing [58], graph processing [1] [59], sparse matrix operations [49], and others [31]. SIMT and other GPU-like processors have been discussed in the near-memory role, including [58], [43], and [45].

Several open source GPUs have been developed for various applications, including FlexGrip [2] and Nyuzi [9]. These processors are similar to Harmonica in that they are, to some extent, parameterizable, but this parameterization does not extend to instruction set aspects such as the number of general-purpose registers. These designs are not specifically targeted toward in-memory applications and have not been evaluated in detail for the physical or performance characteristics that would qualify them for the study of PIM applications. To the author's knowledge, this is the first work presenting an in-depth gate-level evaluation of a SIMT core performed entirely using open-source tools and designs, and the first low-level evaluation of a SIMT core for near-memory computing.

Our work with the open source CHDL hardware description environment parallels recent work on Chisel, an open source hardware description language and framework [4]. Much like CHDL, Chisel provides an environment for writing hardware generators in a general-purpose programming language; in the case of Chisel, that language is Scala instead of C++. There is now a large body of work on accelerator architectures in general, and fixed-function accelerators in particular have received newfound attention in response to near-memory computing and dark silicon, with tools like Aladdin [51] providing ways to quickly analyze sets of accelerators without performing, as we have for Harmonica, low-level hardware design. Reconfigurable accelerators have also made a comeback in the domain of near-memory processing, with architectures like HRL [22] providing reconfigurable fabrics in the logic layer. Accelerators can be produced for these by a variety of methods, including automatic generation from software source code, as demonstrated with the Delite HDL [32]. HARP is described as a customizable architecture in the sense that it is parametric. While this is true, it differs significantly from the kind of low-level architecture generation performed in work such as [13].

4.9 Conclusion

A Harmonica core with 32 lanes and 32 warps fits in the logic layer of a Hybrid Memory Cube style vault, can nearly saturate the available bandwidth for many applications while tolerating the latency of directly accessing DRAM, and, assuming that trends from our analysis of smaller cores hold, fits within the 10 watt thermally-imposed power constraint of such near-memory cores. As the Hybrid Memory Cube failed to capture market share and has been displaced by high-density 3D-stacked DRAMs such as HBM, which lack a layer implemented in a logic process, the direction forward is to re-evaluate such a core in the context of silicon or ceramic interposers or PCB technology and larger, less random-access-oriented DRAM banks. Since the Harmonica core design is intended to maximize throughput and bandwidth, it is likely to perform well no matter what bonding technology is used to deliver that bandwidth to the core, though further work is needed to evaluate this quantitatively.

CHAPTER V

GUARDED ATOMIC ACTIONS IN CHDL

One of the primary advantages of CHDL as an environment for implementing digital hardware is its extensibility through C++ programming language features. Since C++ is a general-purpose programming language that makes multiple paradigms realizable through features such as operator overloading and template metaprogramming, while on the surface CHDL is a system for implementing hardware generators, it is possible to piggyback other paradigms on top of it. This chapter explores a specific example of this kind of use of CHDL: an implementation of guarded atomic actions (GAA), implemented as a C++ library and allowing the use of existing CHDL data types, including much of the CHDL template library. The ability to compose library functions in CHDL across multiple paradigms enables design reuse and repurposing far beyond the ability to wire together simple modules.

Guarded atomic actions, of which the Bluespec [40] hardware description languages are the only popular embodiment, is a paradigm for expressing digital logic as a set of registers and rules: collections of register assignments invoked, or fired, when a set of associated guard predicates are true. These sets of registers and rules are combined into modules, which can themselves contain instances of other modules. Each module exports an interface of actions and values. Values may be read at any time and represent the module state; actions must be invoked by rules and include additional guard predicates that may block the invoking rule from firing. The GAA paradigm provides both a guarantee of fairness (all rules that may fire will fire) and a guarantee of atomicity (no two rules affecting the same data will fire during the same clock cycle). Among the benefits of guarded atomic actions as a design paradigm is that power, area, and performance may be traded off given the same input design.

A simple implementation of guarded atomic actions has been created using CHDL, providing both a case study in adapting other paradigms to CHDL in a composable fashion and an open source C++ implementation of GAA with a path to hardware netlists. It also provides another path to a richer set of instruction-set-based accelerators implemented within the CHDL ecosystem. Early work with GAA, such as [52], focused on its use to describe instruction set processors, and this application remains relevant.

Figure 24: Guarded atomic actions operates at a level of abstraction more concrete than high-level synthesis but more abstract than register-transfer level. (The depicted spectrum runs from gate-level and structural CHDL, through RTL CHDL and CHDL GAA, to functional Cheetah and high-level synthesis.)

It should be noted that this is not the first open source implementation of the GAA paradigm. David Greaves of Cambridge produced a complete open source implementation of Bluespec [25], providing support for the language functionality as well as the paradigm. This does not, however, provide a stand-alone open source path to gate-level netlists, nor does it function as a library for a generative HDL, as Greaves's own later work using the Chisel HDL does [24].

The CHDL GAA library demonstrates the flexibility afforded by the use of C++ for hardware design by providing a higher level of abstraction than the structural and register transfer level description provided by CHDL itself, as illustrated graphically in Figure 24.

The guarded atomic actions paradigm is more abstract than register transfer level in that a cycle-accurate simulation cannot be produced simply by analyzing a GAA description of a hardware unit. GAA is, however, more concrete than high-level synthesis because the input is an explicitly parallel, hardware-oriented description and not an algorithm presented in a procedural language. This illustrates that, while CHDL ultimately generates an in-memory netlist, it is not restricted to netlist-oriented design paradigms, as already demonstrated by the CHDL RTL API and further evidenced by the pipeline description language Cheetah, described in the next chapter.

Figure 25: The CHDL implementation of guarded atomic actions performs the transformation of GAA variables into registers which are assigned based on the values of guard predicates. (The depicted example uses the three GCD rules of Section 5.3.1 — Rule 1: if (busy && x < y), Rule 2: if (busy && x >= y && y != zero), Rule 3: if (busy && y == zero) — with a controller selecting the inputs of the x, y, and busy registers.)

5.1 The GAA Paradigm

The essence of the guarded atomic actions paradigm is the ability to design hardware as a hierarchy of state machines, with each component providing an abstract interface through a set of exposed methods, including actions, which modify state, and values, which simply read state. The behavior of these hardware units is governed by a set of rules, each activated when a set of conditions is satisfied. These rules act on internal registers and may also trigger actions in other modules; these actions are themselves rule-like, assigning their own registers and possibly triggering further actions further down the hierarchy of modules.

5.2 An Implementation of GAA in CHDL

Guarded atomic actions can be seen as an extension of the register transfer level paradigm, adding implicit predicates to prevent rules that write the same register from firing simultaneously and enabling the implementation of the method-call ready/valid signal interface. The same implementation strategy used in the CHDL RTL implementation can be used here: a custom register type may be introduced along with an API amounting to a domain specific language for implementing GAA designs.

Because the features of CHDL afford extensibility with a minimum of boilerplate code, the CHDL GAA library weighs in at only 214 lines of C++ code. Two data structure definitions totaling 40 lines and six additional utility functions totaling 128 more lines form the entirety of the CHDL implementation of guarded atomic actions, the bulk of which is in the two scheduler generation algorithms. The data type system and all of the logic primitives are provided by CHDL itself, leaving the GAA library as simply a way to declare rules and a mechanism for storing and updating state. The fact that CHDL-GAA does not introduce a completely new HDL and does not require porting a library of combinational primitives or existing functional units is one of the most attractive features of implementing paradigms such as GAA as libraries for existing generative HDLs instead of creating new HDLs in their own right.

An implementation of guarded atomic actions must provide three basic data structures: a register type describing a storage location for data, a rule type describing a transformation that can be made on these registers if conditions are met, and some kind of module for packaging related registers, the rules that act on them, and the interface they provide to other modules. In the CHDL implementation, software structures have been created explicitly to handle registers and rules, and a convention has been established to provide the equivalent of a module. Once the design has been described, a final gaa_generate() function is called to produce a concrete CHDL logic implementation of the rule scheduler and clean up the heap-allocated rules.

5.2.1 gaareg: A Templated GAA Register Type

The CHDL implementation of GAA provides registers using the gaareg class, a simple template class that may be wrapped around any CHDL signal data type, including numeric types, aggregates, or single nodes. The register class includes a unique identifier for use in scheduler generation and a member function gen() that generates a CHDL register for the register contents, as well as the multiplexer and write signal used to assign this register when one of its associated rules fires. This gen() function is called by the destructor of the gaareg class, guaranteeing that any rules using the register will have been declared before this generator is called. This does impose the restriction that code using guarded atomic actions must either explicitly call this gen() function or allow all gaareg objects to go out of scope before performing GAA scheduler generation, logic optimization, simulation, and printing of HDL output. The gaareg class also defines a cast operator that converts its contents to type T simply by returning the CHDL register's output signal.
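Based on this description, the shape of the gaareg class can be sketched as follows. This is a minimal reconstruction for illustration, not the actual library source; the member names and the duplicate-generation guard are assumptions.

// Minimal sketch of a gaareg-like register type, reconstructed from
// the prose above; member names and internals are assumptions.
template <typename T> struct gaareg {
  unsigned id;      // unique identifier used in scheduler generation
  bool generated;   // guards against emitting the hardware twice
  T q;              // output signal of the underlying CHDL register

  gaareg(): id(next_id++), generated(false) {}

  // Emit the CHDL register along with the multiplexer and write
  // signal driven by the predicates of the rules that assign it.
  void gen() {
    if (generated) return;
    generated = true;
    /* ... CHDL register/mux construction elided ... */
  }

  // The destructor calls gen(), so all rules using this register
  // have necessarily been declared by the time hardware is emitted.
  ~gaareg() { gen(); }

  // Reads simply return the CHDL register's output signal.
  operator T() const { return q; }

  static unsigned next_id; // defined once in the library source
};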

5.2.2 Rule(): Expressing GAA Rules

Rules are declared by calling a function, Rule(p), where p is a CHDL node specifying the rule's predicate. This function's return value is a reference to a heap-allocated rule_t object, whose member functions also return a reference to the same object, allowing a call to Rule() to be followed by a chain of invocations of its member functions, which define the rule's behavior. These member functions are Assign(reg, val), which, when the rule fires, assigns a value of type T to a gaareg, and Call(obj, func, arg1, ...), which invokes an action, defined as a member function, on a given object. This action invocation is guarded by the action's guard predicate, and the rule will be prevented from firing unless this predicate is also true.
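The resulting declarations chain naturally. For instance, a hypothetical two-register swap rule might be written as follows (the complete GCD module in Section 5.3.1 uses the same pattern); here swap_en, a, and b are assumed to be a CHDL node and two gaareg registers declared elsewhere.

// Hypothetical rule declaration in the chained style described above.
Rule(swap_en).   // fires when swap_en is true and no write conflict
  Assign(a, b).  // both assignments take effect atomically, so the
  Assign(b, a);  // registers exchange values in a single cycle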

5.2.3 C++ Classes as Modules

The module concept has a ready analog in C++ classes and structs. A class may have member variables that are themselves instances of other classes, allowing the hierarchy of modules central to the design of Bluespec to be implemented. Since gaareg is itself a C++ structure, registers may also be instantiated as member variables. Classes also have constructors, which run on instantiation and in which all of the calls to Rule() may be placed.

Methods are provided by member functions. Value methods may simply be functions that return CHDL variables when they are called. To support action methods, a function called Action() is provided. It simply allows the continuation of the rule that is currently being built in the calling function, and takes as its argument a guard predicate. The predicate of the rule invoking the action becomes the logical and of the rule's explicit predicate and this guard predicate.

Figure 26: Example of a colored graph. No adjacent nodes have the same label, here represented by a pattern.

5.2.4 Scheduler Generator

Two scheduler options are provided by the CHDL GAA implementation: a lightweight static scheduler and a potentially more heavyweight dynamic scheduler. Both guarantee fairness and atomicity with no requirement that rule predicates be mutually exclusive. The static scheduler pre-selects groups of rules known to be executable in parallel using graph coloring, and maintains fairness by selecting one of these groups to execute each cycle, allowing other individual rules to execute if there are no conflicts. The dynamic scheduler finds groups of simultaneously-executable rules based on the predicates' values in each cycle and maintains fairness by rotating the priority order of all rules each cycle.

When multiple rules may update the same register during the same cycle, it is necessary to serialize this behavior in a way that preserves the guarantees of atomicity and fairness. Since, in the CHDL implementation, each rule fully executes within a single clock cycle, the only requirement for atomicity is that write conflicts are avoided. To avoid write conflicts, each rule is provided with, in addition to its predicate signal, a stall signal generated by a central scheduler. This scheduler is generated by the gaa_generate() function that must be called, after all rules are declared, in every CHDL function using the GAA library.

We may formulate the problem of write conflicts using a graph with vertices representing rules with true guard predicates during a given cycle and edges representing destination register conflicts. As long as no rules corresponding to neighboring vertices in this graph fire, write conflicts are prevented and atomicity is preserved. Unnecessary stalls, while reducing performance, will not have a detrimental impact on correctness. Graph coloring, as illustrated in Figure 26, is the process of labeling the nodes in a graph so that no two adjacent nodes have the same label. Higher-quality colorings of a graph use fewer unique labels. If we expand our graph to include vertices for rules not firing during the cycle, we can use a graph coloring algorithm to statically identify sets of rules that may always safely fire simultaneously. This is our static scheduling algorithm, described in Figure 27.

 1  -- inputs: a set of rules, each having
 2  --   pred[i]: a predicate
 3  --   dest[i]: set of destination register IDs
 4  V := 0..Nrules-1;
 5  E := {(i, j) : rules i and j share a destination register};
 6  -- color_graph returns a vertex->color map and a
 7  -- number of colors for a given graph
 8  [C, N] := color_graph(V, E);
 9  ctr := gen counter from 0 to N-1;
10  for i in V:
11    ocpred   := gen Or(pred[j] for all rules j : C[j] != C[i]);
12    cstall   := gen ocpred && (ctr != C[i]);
13    stall[i] := gen cstall && Or(pred[j] for rules j : (i,j) in E);

Figure 27: The static scheduling algorithm option for the CHDL GAA implementation relies on graph coloring. Bold-face type is used to represent the names of variables describing signals in the generated hardware.

In schedulers produced by this algorithm, fairness is maintained by using a counter that repeatedly iterates through the rule color IDs, identifying the active rule color. During a cycle, if any predicate with the active rule color is true, that rule is allowed to fire. A rule of another color may fire only if no predicates for rules of any other color, including the active color, are true. This way, the performance of mutually exclusive rules operating on the same set of registers is not impeded, but fairness is enforced.

The dynamic scheduling algorithm, provided as pseudocode in Figure 28, represents each predicate individually as an element in a matrix of nodes. In order to provide fairness, the rows of this matrix are rotated to place the current highest-priority rule in the top position. Since the degree of rotation is set by a counter, no rule spends more than a cycle in the top-priority position until every rule has had a chance. There is no possibility that the highest-priority rule will be stalled, but any rule attempting to write a register written by a higher-priority rule is stalled. Since this scheme is scheduled on a per-rule basis instead of per statically-assigned scheduling block, it requires considerably more hardware to implement but may lead to more efficient execution.

 1  -- inputs: a set of rules, each having
 2  --   pred[i]: a predicate
 3  --   dest[i]: set of destination register IDs
 4  P[i,j] := pred[i] if rule i assigns reg j,
 5            0 otherwise;
 6  ctr := gen counter from 0 to max rule ID;
 7  R := gen rotate rows of P by ctr;
 8  S := gen for each column of R find most-significant
 9       set bit and set all smaller bits;
10  T := gen rotate rows of S by ctr,
11       direction reverse of line 7;
12  A[i,j] := gen literal 1 if rule i assigns reg j,
13            0 otherwise;
14  stall[i] := Or(A[i] & T[i]);

Figure 28: The dynamic scheduling algorithm option for the CHDL GAA implementation relies on the generation of relatively expensive logic. Bold-face type is used to represent the names of variables describing signals in the generated hardware.

5.3 Applications Implemented Using CHDL GAA

The example applications for the CHDL GAA library are a hardware implementation of the dining philosophers problem illustrating the fairness of the schedulers implemented, an implementation of Euclid's algorithm for finding the GCD of two numbers, an implementation of the Sieve of Eratosthenes providing its results as a vector of bits, and a hardware block that projects 3D points onto a 2D plane for the purpose of, e.g., driving vector displays. These are provided to illustrate the expressiveness of the GAA paradigm, the fairness of the CHDL GAA scheduler, and the flexibility provided by the CHDL type system within the CHDL GAA implementation, allowing different fixed-point, floating point, and non-numeric data types to be used.

Table 10: Applications implemented using the CHDL implementation of guarded atomic actions.

Name          Lines   Description
Philosophers  14      Dining philosophers fairness demonstration.
GCD           31      Greatest common divisor.
Sieve         48      Sieve of Eratosthenes.
Project       54      Project 3D points on plane.

5.3.1 GCD: Euclidean Algorithm

To provide an example of the syntax afforded by the CHDL GAA implementation, we may examine an implementation of Euclid’s algorithm for finding the greatest common divisor. This is the algorithm used to introduce the GAA paradigm in much of the literature on guarded atomic actions, including [48]. It is a useful example because it performs a productive task with a small number of rules:

• Given two values x and y.

• While y is not zero:

  – if x ≥ y: x ← x mod y

  – otherwise swap x and y

This algorithm is not limited to numbers and may be used to determine if, for instance, two polynomials are co-prime. The extension to polynomials is made readily in $GF(2^m)$, and a CHDL gf type has been implemented to realize this as well.

The CHDL GAA implementation of the Euclidean algorithm is written as a single module represented by a templated struct. The single template argument represents the type on which a given instance of gcd operates. This allows the data and rules to be packaged together and allows external inspection of member variables, which, while potentially dangerous for maintaining abstractions, may be useful for debugging.

template <typename T> struct gcd {

Three registers are used by the GCD algorithm. A state variable, busy, remains asserted while the algorithm is being performed and is clear when it has finished. The variables x and y are the state of the algorithm. When it has finished, the result appears in x and y contains zero.

  gaareg<node> busy; // single-bit flag; the template arguments on
  gaareg<T> x, y;    // this page were lost in extraction (<node> is
                     // an assumption)

All of the rules are declared in the constructor for the gcd struct, guaranteeing that the same set of rules will be generated for any instance of the gcd object and allowing the rule declarations to be co-located with the member variable declarations. The three rules declared here implement the algorithm fully, alternately swapping x and y and replacing x with the remainder of x divided by y. Most published versions of a GAA implementation of GCD substitute the less-expensive subtraction operation for the remainder operation, but in addition to requiring significantly more clock cycles to produce a result, that substitution would not be applicable to objects, like $GF(2^m)$ polynomials, for which repeated subtraction is not equivalent to division.

  gcd() {
    T zero;
    Lit(zero, 0);

    Rule(busy && x < y).
      Assign(x, y).
      Assign(y, x);

    Rule(busy && x >= y && y != zero).
      Assign(x, x % y);

    Rule(busy && T(y) == zero).
      Assign(busy, Lit(0));
  }

Initialization of the algorithm is performed by an external invocation of the Load() action. The inclusion of the !busy predicate ensures that this may only occur while the previous GCD is not being computed.

  void Load(const T &a, const T &b) {
    Action(!busy).
      Assign(x, a).
      Assign(y, b).
      Assign(busy, Lit(1));
  }
};

A CHDL module representing a complete demonstration of the GCD algorithm is shown below. It takes two values and, immediately on start-up, loads them into the GCD unit, then permanently sets called, preventing any future invocations.

template <typename T>
void GcdDemo(const T &x_in, const T &y_in) {
  gcd<T> g;

  gaareg<node> called; // <node> assumed; template argument lost
                       // in extraction
  Rule(!called).
    Call(g, &gcd<T>::Load, x_in, y_in).
    Assign(called, Lit(1));

  gaa_generate_color();
}

5.3.2 Other Examples

Dining Philosophers An implementation of the dining philosophers problem was produced to demonstrate both the fairness of the CHDL GAA scheduling algorithms and their ability to make progress despite resource contention. Generally, dining philosophers describes a thought experiment used to illustrate concepts in concurrency. Seated at a table are N philosophers, who alternate between thinking and eating. Between them are N chopsticks. While eating, each philosopher needs two chopsticks, but while thinking none are needed.

In GAA, the philosophers map to rules and the chopsticks to variables. During a given cycle, each register may have a single writer, and rules for which all registers are not available must stall. The CHDL-GAA implementation instantiates N registers containing counters and N rules incrementing these counters, each with an always-satisfied predicate. Both schedulers allow these rules to advance in alternating sets, each containing half of the rules, the optimal pattern that neither deadlocks nor starves any subset of the rules.
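A sketch of how this benchmark might look in the CHDL GAA style follows. This is an illustrative reconstruction, not the 14-line library example itself; the counter width and the pairing of two Assign() calls per rule are assumptions.

// Hypothetical sketch of the dining philosophers demonstration. Each
// "chopstick" is a counter register; each "philosopher" is a rule
// that writes the two adjacent counters, so neighboring rules
// conflict and can never fire in the same cycle.
template <unsigned N> void Philosophers() {
  gaareg<bvec<8> > chopstick[N]; // one counter per chopstick (width
                                 // assumed)
  for (unsigned i = 0; i < N; ++i) {
    bvec<8> one; Lit(one, 1);
    Rule(Lit(1)).                                  // always hungry
      Assign(chopstick[i], chopstick[i] + one).    // left chopstick
      Assign(chopstick[(i + 1) % N],
             chopstick[(i + 1) % N] + one);        // right chopstick
  }

  gaa_generate_color(); // the scheduler alternates between the two
                        // independent halves of the rules
}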

Sieve of Eratosthenes The sieve of Eratosthenes is an algorithm that finds prime numbers by eliminating multiples of known primes until only primes remain. The CHDL-GAA implementation does this in phases, finding the least-significant set bit, then clearing its multiples. Its structure is quite similar to that of the GCD module.

Projection onto Display Coordinates The final example performs a bit of fixed-point arithmetic for a task that might appear in a graphics pipeline, using a set of four rules to iterate through a set of triangles one vertex per cycle, projecting these vertices onto 2D display coordinates. This example was included to provide something more computationally intensive than the other examples that also takes advantage of CHDL's extensive mathematical libraries, since it may be used with either fixed-point or floating-point numeric types.

5.4 Conclusions

The GAA implementation in CHDL demonstrates that it is possible, within the context of CHDL designs, to move the level of abstraction up from gate-level representations and RTL descriptions to popular higher-level paradigms while maintaining a natural syntax. As shown in the following chapter, which describes a novel paradigm and implementation for describing pipelined hardware, a C++-based HDL like CHDL, built around a structural core and allowing manipulation and transformation of the in-memory netlist, is inherently multi-paradigm. The CHDL GAA implementation can be considered a domain-specific language built on top of CHDL, exploiting the features of CHDL to provide a natural means of expressing GAA designs.

CHAPTER VI

CHEETAH: A PIPELINED HLS ENVIRONMENT WITHIN CHDL

The C++ programming language does not allow for true reflection, the ability to access the original syntax tree of the program as data. Because of this, implementing a fully self-contained high-level synthesis system using CHDL would be a difficult proposition. However, while high-level synthesis provides a path to code reuse and designer productivity, its drawbacks, primarily in terms of designer control over the timing behavior and structure of the produced logic, have led to the continued prevalence of lower-level paradigms, such as GAA, discussed in the previous chapter. Another such paradigm is provided by Cheetah, a C++ library providing a pipeline description language implemented on top of CHDL. Cheetah automates the process of inserting pipeline flip-flops and managing valid and ready signals for pipelined designs, including designs with divergent and convergent data flow. This can be seen as providing an analogy to multi-threaded execution: an approach that approaches the convenience of high-level synthesis while retaining cycle-level control of the data path and bit-level control of the representations of data and of the hardware units used to process it.

By exploiting the equivalence between multithreaded software and pipelined hardware, we can quickly construct, model, and analyze a range of both fixed-function and instruction set accelerators well-suited to the energy constraints of modern architectures. This is to be reached by:

• Realization of a domain specific language that provides for the high-productivity, high-performance modeling of pipelined accelerators by exploiting the equivalence of these accelerators with multithreaded software execution.

• Implementation of a range of fixed-function and general purpose accelerators.

• Automatically-generated area, energy, and fault models of these accelerators.

• Evaluation of these accelerators in the context of near-memory processing.

The proliferation of accelerator cores of various types, both fixed-function and general-purpose, has led to a simulation gap in microarchitecture research. Accurately modeling the performance, energy, area, and reliability of every component of an accelerator-rich architecture is a daunting task. Much work has been done recently on creating integrated simulation platforms bringing together the best tools for modeling each component of potentially-heterogeneous many-core architectures [47] [57] [53], but for many classes of accelerators, the models do not yet exist.

It is difficult to produce credible area and power estimates without either an example of the modeled component in silicon, a silicon-ready HDL design, or at least a sketch of the design in hardware at the level of detail needed to infer an HDL design. Producing models of accelerators using hardware description languages like Verilog and VHDL, and even more modern HDLs like Chisel, is a labor-intensive process compared to writing cycle-level models in a high-level language like C++. Given the variety of accelerators a system might be expected to contain, the necessity of producing low-level models, and the complexity of producing them, an approach must be found that reduces the burden placed on the simulator writer.

High-level synthesis (HLS), simply transcompiling code written in a language such as C to a synthesizable HDL, is one such approach, and for many types of accelerators it may be the best option. HLS approaches greatly reduce the burden on the designer to manage the details of how operations in a program are mapped onto physical devices and how these resources are scheduled. HLS approaches are a win for designer productivity, but they take a great hit in terms of expressiveness. HLS solutions map a program to an HDL with no regard for the physical structure of the resulting hardware. Channels other than the input source code must be used to describe physical constraints, and these only guide the generation of the HDL code. For this reason, there are specific optimizations that HLS solutions may miss and specific designs that are difficult to express using HLS. Given the source code for an instruction set emulator, an HLS solution cannot reasonably be expected to produce a high-performance general-purpose processor.

To address this lack, an intermediate-level synthesis tool is needed. Instead of producing HDL code based on a language designed for programming general-purpose processors, a high-level HDL is needed that can express arbitrary hardware but simplifies common tasks associated with building high-performance accelerators and processor cores, such as pipeline control, moving that burden from the designer to the toolchain. Ideally, the output generated from this pipeline-oriented HDL would be targetable both to a synthesis flow, for the production of area, power, and fault models, and to a high performance simulation flow, for integration into system-level models.

One of the many advantages of an HDL-oriented microarchitecture research flow is that it permits the creation, when available, of FPGA-based prototypes. The use of such prototypes, while limited in scope to relatively small systems, allows speeds unmatched by purely software solutions and accordingly allows validation of simulation results with longer-running applications. The energy consumption of a prototype running on an FPGA is not, however, necessarily related to that of the same design running on silicon. To take full advantage of this workflow, it is necessary to develop low-overhead energy models that can run alongside the custom hardware on the FPGA.

The proposed work is then, in summary, to:

• Demonstrate the feasibility and productivity of a pipeline-oriented HDL in the domain of microarchitecture modeling for a variety of fixed-function and general-purpose accelerators.

• Develop techniques for deriving fast power and fault models from the HDL.

• Demonstrate the feasibility of using accelerators so generated in FPGA-based prototypes, including the incorporation of FPGA-based power models.

6.1 Pipeline-Oriented Hardware Description Language

The basic approach taken is one of creating analogies between pipelined hardware and multi-threaded executions of software, and building on those analogies to encompass a range of common hardware techniques. This allows for a natural programming model similar to that of high-level synthesis tools, but with a well-defined one-to-one mapping between source code and generated hardware structures. Arithmetic and logic instructions represent fixed hardware units. Basic blocks represent pipeline stages, with a pipeline latch or buffer holding all of the signals of the live input set and an arbiter selecting between predecessor blocks. The complete pipeline model is explained in Section 6.2. Conditional branches are dispatch operations, in which a successor functional unit is addressed based on a signal. A key difference from software is that any number of successors may be branched to simultaneously.

Figure 29: An example (a) of a pipeline design that fits the model used and (b) the corresponding generalized pipeline stage design.

6.2 Pipeline Model

It is necessary, before describing the features of the proposed language, to describe the model it uses to represent pipelined hardware. At their most basic, pipelines simply divide high-latency tasks into a linear sequence of lower-latency steps, allowing a higher clock rate, and therefore higher throughput, for the area overhead of a set of pipeline latches. Practical pipelined designs may have stages with multiple successors and predecessors, communication between stages that does not pass through pipeline registers, shared SRAM arrays, and a need to stall computation in individual stages and insert bubbles in order to maintain correctness. These features taken together constitute the pipeline model, as illustrated in Figure 29. For the purposes of this document, a pipeline is an arbitrary graph, not necessarily linear, of pipeline stages, each of which consists of, as labeled in Figure 29: (1) an input arbiter, (2) a pipeline latch or FIFO, (3) the stage's logic, (4) connections to any external broadcast variables or SRAM arrays, (5) logic for generating pipeline stalls, and (6) output logic for selecting successors.

In the degenerate case, this corresponds to the simple linear pipeline described above: each stage has an input latch, some logic, and an output, and the arbiter and successor selection are optimized away. In more complex designs, the output selection circuitry and input arbiter allow for cases like the dispatching of instructions to functional units in a processor pipeline, the stalling or deflection of requests waiting for input to become ready, or the forwarding networks of modern processors.

It is assumed that there may be some interfaces to external hardware, and that interaction with these is done through an interface that can be modeled as request/response, in which a request containing some information is issued and, after an unknowable number of cycles, a response is delivered. Systems with a one-way, request-only interface can be modeled this way simply by assuming the response is returned immediately. Sections of pipeline can themselves be encapsulated within this request/response interface, provided that live values are preserved from request to response.

The pipeline stages described here can be considered analogous to the basic blocks of software, and the request/response interface to a procedure call, but an accelerator implemented using this model only achieves its theoretical performance when multiple pipeline stages are active simultaneously. To continue the software analogy, this is a multi-threaded execution. It is by exploiting this equivalence between multithreaded software and pipelined hardware that we can quickly construct, model, and analyze a range of both fixed-function and instruction set accelerators especially well-suited to the massively parallel, low power environment of near-memory computation.

Table 11: Cheetah functions for defining pipeline stages.

Function          Description
PlLabel("name")   Start a new named pipeline stage.
PlStage()         Start a new unnamed stage.
PlStall()         Current stage's stall signal.
PlValid()         Current stage's valid signal.

6.3 Pipeline Description Language API

Cheetah can be thought of as extending the basic CHDL API with the concept of a pipeline stage, analogous to a basic block with an optional label; a jump function for controlling the flow of data between basic blocks; a stall function for implementing pipeline stalls; and the concept of a pipeline variable, which is carried between blocks through a set of automatically-generated pipeline registers. Stall signals may propagate throughout a pipeline and represent a potential performance bottleneck. The insertion of multi-entry buffers instead of simple D flip-flops allows the decoupling of pipeline stages to mitigate the impact of large many-stage pipelines. To enable the insertion of buffers to solve this and other pipeline stage decoupling problems, the concept of a buffer is provided as well.

6.3.1 Pipeline Stages

In a traditional pipelined design, a pipeline stage can be considered a step in an activity that takes place during a single clock cycle. Stages are performed concurrently, improving throughput but complicating designs when communication must take place between pipeline stages. Cheetah provides the PlLabel() and PlStage() functions to make declaring a stage roughly equivalent to creating a label or line number in a traditional programming language.

This may seem limited, in that CHDL does not provide a way to declare a pipelined arithmetic operation. For instance, using the CHDL STL floating point addition operation creates a single unit of combinational logic, and the use of PlStage() repeatedly following such an operation merely creates empty pipeline stages. Cheetah relies on subsequent retiming operations to achieve performance on arithmetic operations. By declaring empty pipeline stages following complex combinational operations, a Cheetah user may leave room for such transformations to operate, thereby increasing throughput. A small sketch of this pattern follows the tables below.

Table 12: Cheetah functions for control flow between pipeline stages.

Function           Description
PlJmp("dest", x)   Conditionally jump to named stage.
PlStall(x)         Conditionally stall pipeline.
PlEnd()            Terminate pipelined control flow.
PlBuf(n)           Insert n-entry pipeline buffer.

Table 13: Important member functions of the plvar type.

Function              Description
plvar.get()           Get value in current stage.
plvar.get("stage")    Get value in another stage.
plvar.set(x)          Set value at this stage's output to x.
plvar.set(x, cond)    Conditionally set value.
PlValid("stage")      Valid signal of another stage.
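Assembled from the functions in Tables 11 through 13, a minimal two-stage pipeline fragment might look like the following. This is a hypothetical sketch: the stage names, the carried type, and the generator function itself are illustrative assumptions, and the full Mandelbrot design of Section 6.4 is the real example.

// Hypothetical Cheetah fragment based on Tables 11-13.
template <typename T> void DoubleAndEmit(const T &in) {
  plvar<T> x;

  PlLabel("read");     // named stage: accept a new work item
  x.set(in);

  PlStage();           // empty unnamed stage: leaves room for
                       // retiming to spread combinational logic

  PlLabel("emit");     // second named stage: compute the result
  x.set(x.get() + x.get());
  PlEnd();             // terminate pipelined control flow
}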

6.3.2 Pipeline Control Flow

In the simplest possible linearly pipelined design, work items propagate through the pipeline stage by stage, entering unprocessed and emerging at the end completed. Modern in-order processor pipeline designs, however, include a dispatch stage in which a single instruction may be dispatched to one of a number of pipelined functional units, each of which may have a different pipeline length.

Using our multi-threaded execution analogy, dispatch operations like this may be seen as equivalent to a series of conditional branch instructions, each with a different pipeline stage as its destination.
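Under that analogy, a dispatch stage might be expressed as a series of PlJmp() calls, one per functional unit. The following fragment is a hypothetical sketch: the opcode width, the encodings, and the stage names are assumptions, with only PlLabel() and PlJmp() coming from Tables 11 and 12.

// Hypothetical dispatch stage written as successive conditional
// jumps, following the analogy above.
void Dispatch(const bvec<6> &op) {
  bvec<6> OP_ALU, OP_MUL, OP_MEM;
  Lit(OP_ALU, 0); Lit(OP_MUL, 1); Lit(OP_MEM, 2);

  PlLabel("dispatch");
  PlJmp("alu", op == OP_ALU); // single-cycle integer operations
  PlJmp("mul", op == OP_MUL); // longer multiplier pipeline
  PlJmp("mem", op == OP_MEM); // load/store unit
}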

6.3.3 Pipeline-Carried Values

A templated data type, plvar, is introduced to describe data that flows through a pipeline. Any CHDL type may be encapsulated in a plvar and accessed with the get() and set() member functions.

Figure 30: Mandelbrot set visualization produced by simulation of the CHDL/Cheetah design. Intensity represents the number of iterations required to prove divergence. (Axes: real and imaginary position.)

One of the key advantages of a pipeline-oriented hardware description language is that there is no need to manually describe the contents of each pipeline register and keep track of which values are needed in which stages, as was done in the Iqyax and Harmonica cores described in Chapter 4. Instead, liveness analysis, a technique borrowed from compilers, is used to determine which variables must be available both at the inputs and at the outputs of each pipeline stage.
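A minimal hedged sketch of what liveness analysis buys the user; the stage names and the signals producer_value and consumer_value are placeholders invented for illustration.

plvar<bvec<32> > v;

PlLabel("produce"); {
  v.set(producer_value);        // written once, in one stage
}
PlStage();
PlStage();                      // no pipeline registers declared by hand
PlLabel("consume"); {
  consumer_value = v.get();     // liveness analysis carries v here
}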

6.4 An Example: Mandelbrot Set Visualization

Having introduced the basic components of the Cheetah API, it is worthwhile to consider a more concrete example. The Mandelbrot set is a mathematical curiosity comprising all complex numbers $c$ for which the iterated function $f_{i+1}(c) = f_i(c)^2 + c$ remains bounded. Finding points likely to be in the Mandelbrot set is a matter of evaluating this expression repeatedly. Values with a norm greater than 2 are guaranteed to diverge, providing a terminating condition; a maximum iteration count bounds execution time for points that never diverge. Visualizations of the Mandelbrot set typically indicate the number of iterations required to prove divergence, producing infinitely complex and visually striking patterns.

Visualizing points in the Mandelbrot set is a classic example of an embarrassingly parallel, compute-bound application. Iterations on individual points are serial, but many points can be evaluated simultaneously and independently of one another, with no communication or storage requirements. A plain-software reference for the per-point iteration follows.
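For reference, the per-point iteration that each pipeline thread performs can be written in plain C++ as follows; this is a software sketch for exposition, not part of the Cheetah design, with the 255-iteration cap matching the hardware version below.

#include <complex>

// Returns the number of iterations needed to prove divergence, capped at 255.
unsigned mandel_iters(std::complex<float> c) {
  std::complex<float> z = c;                 // hardware also seeds z with c
  unsigned i = 0;
  while (i < 255 && std::norm(z) <= 4.0f) {  // norm(z) = |z|^2, so |z| <= 2
    z = z*z + c;
    ++i;
  }
  return i;
}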

The Cheetah implementation of this algorithm is 127 lines long, shorter than a non-pipelined plain CHDL version, although that version spends tens of additional lines implementing a complex-number arithmetic library not needed in the Cheetah version. The Cheetah version, however, provides a variable-length pipeline executing multiple iterations in parallel, providing nearly-linear scaling with the number of parallel iterations despite each iteration itself occupying three pipeline stages.

6.4.1 Constant Declarations

The CHDL numeric library, part of the CHDL STL discussed in Section 3.5, establishes a convention of declaring literal values of real-number types using the Lit() function. All constants used are declared at the beginning of the main function.

word_t x_inc, y_inc, x0, y0, four, two;
Lit(x_inc, XINC);
Lit(y_inc, YINC);
Lit(x0, XMIN);
Lit(y0, YMIN);
Lit(four, 4.0);
Lit(two, 2.0);

6.4.2 Pipeline Variable Declarations

After the constants, all of the pipeline-carried values are declared. It is not necessary to declare these all at the beginning of a CHDL pipeline generator program, or for their scope to persist. The only reason these are declared at the beginning is to clearly demonstrate the amount of pipeline-carried state.

The first two variables, i and j, contain the integer coordinates of the point.

plvar<bvec<20> > i, j;

The type word_t has already been declared to be equivalent to fp16_t, a 16-bit half-precision floating point type, also from the CHDL numeric library. The variables x and y contain the real and imaginary components, respectively, of the point being evaluated. This is the fixed value, the $c$ in $f(z) = z^2 + c$. The result of this iterated evaluation is stored in zx and zy.

plvar<word_t> x, y, zx, zy;

The iteration count is stored in iter. At 255 iterations, a point is considered likely to be contained in the Mandelbrot set and the loop is exited.

plvar<bvec<8> > iter;

The ovfl flag is set when the point is proven not to be a member of the Mandelbrot set, i.e. when $|z| > 2$ for any iteration.

plvar<node> ovfl;

6.4.3 Spawn Loop

A thread must be spawned for each point to be processed, and an ordinary CHDL state machine consisting of two counters, ictr and jctr, is used to do this. These counters are stalled, through the use of the write-enable signal on the Wreg(), during all pipeline stalls. These non-pipelined values enter the pipeline through the calls to the set() functions of the plvars.

PlLabel("main"); { bvec<20> ictr, jctr;

113 node iexp(ictr == Lit<20>(XSIZE-1)), jexp(jctr == Lit<20>(YSIZE-1)); PlSpawn(ictr < Lit<20>(XSIZE) && jctr < Lit<20>(YSIZE)); i.set(ictr); j.set(jctr);

ictr = Wreg(PlValid() && !PlStall(), Mux(iexp, ictr + Lit<20>(1), Lit<20>(0))); jctr = Wreg(PlValid() && !PlStall() && iexp, jctr + Lit<20>(1));

word_t new_x(x0 + x_inc*IToF<5,10>(ictr)), new_y(y0 + y_inc*IToF<5,10>(jctr));

x.set(new_x); y.set(new_y); zx.set(new_x); zy.set(new_y); }

6.4.4 Room for Retiming

The floating point intensive generation of new_x and new_y is best spread over a few pipeline stages. Since the only dependencies in the "main" stage are on the previous values of simple binary counters, the computation of new_x and new_y will be spread over all of the new stages created here by retiming operations.

for (unsigned ii = 0; ii < 4; ++ii) PlStage();

6.4.5 Main Iteration Logic

The constant integer STAGES configures the number of simultaneous iterations being evaluated. The "main_loop" stage is the beginning of a long loop of repeated iterations, each itself made of ISTAGES individual pipeline stages. When a terminating condition, either an overflow or the iteration count limit, is reached, control flow is transferred to the "end" stage.

PlLabel("main_loop"); { for (unsigned ii = 0; ii < STAGES; ++ii) { ovfl.set(zx.get()*zx.get() + zy.get()*zy.get() > four);

zx.set(zx.get()*zx.get() - zy.get()*zy.get() + x.get()); zy.set(two*zx.get()*zy.get() + y.get()); iter.set(iter.get() + Lit<8>(1));

for (unsigned jj = 1; jj < ISTAGES; ++jj) PlStage();

PlJmp("end", (ovfl.get() || iter.get() == Lit<8>(255)), ii); if (ii != STAGES-1) PlStage(); }

PlJmp("main_loop", Lit(1), 0); PlDft(); }

6.4.6 Serialization of Results

The stage labeled "end" provides a way to gather the output and pass it out of the functional unit. In this implementation, x_out and y_out are bvec<20> parameters of the function containing the pipeline, and iter_out is a bvec<8>. The call to PlEnd() is made so that, in the absence of a successor stage, threads reaching this point will not stall indefinitely.

PlLabel("end"); { x_out = i.get(); y_out = j.get();

115 iter_out = iter.get(); valid_out = PlValid(); PlEnd(); }

6.5 Applicability to Instruction Set Processors

While the Mandelbrot set visualization is a simple but nontrivial example of a high-performance pipelined design, the natural application for this kind of utility is the creation of instruction set processors. As an example of this type of application, a stream processor implementing a simple instruction set and using scoreboarding to avoid pipeline hazards has been implemented.

6.5.1 Example Processor Design

With the aim of providing a concrete example of an instruction set processor implemented using the Cheetah pipeline description language, a simple design was created. This in-order core contains variable-latency functional units and uses a scoreboarding scheme for hazard detection. Its simple instruction set contains only a few floating point operations and branches. To avoid focusing attention on addressing modes or memory system implementation, this instruction set only extracts data from one FIFO buffer and places results into another FIFO buffer.

Opcode Mnemonic Description

04 jmp Unconditional jump.

10 put Put word in output FIFO.

21 jz Jump if zero.

22 jn Jump if negative.

23 jp Jump if positive.

40 ldi Load immediate integer.

42 get Get word from input FIFO.

71 add Floating point add.

72 sub Floating point subtract.

73 mul Floating point multiply.

6.5.2 Signals

This processor design presumes that branches are not taken, so the take_branch signal can be interpreted as an indication of a branch misprediction, with the corrected program counter contained in branch_addr. Note that neither of these signals is a plvar; they are non-pipelined signals. This can be thought of as communication between different simultaneous threads of execution, where a branch instruction tells a set of mis-speculated instructions to terminate and the program counter to update.

The remaining variables (iaddr, ienc, a, b, and result) are all pipeline variables. They travel through the pipeline along with instructions, and correspond to each instruction's address in instruction memory, machine language encoding, operands, and result, respectively. As seen below, each of these is read with get() and written with set() at appropriate points in the pipeline implementation.

node take_branch;
bvec<10> branch_addr;        // 10-bit PC width assumed; template args lost in extraction
plvar<bvec<10> > iaddr;
plvar<bvec<32> > ienc;
plvar<word_t> a, b, result;

6.5.3 Instruction Fetch

Two pipeline stages are devoted to instruction fetch. The first computes the program counter and the second reads the instruction from instruction memory. A ROM is provided in this example as an instruction memory; implementations employing other forms of instruction storage would be similar. Both of these stages flush their contents on a taken branch: fetch1 will not spawn a pipeline thread while a branch is being taken, and fetch2 will not jump to the next stage, reg, in the event of a taken branch. The call to PlEnd() in fetch2 ensures that when no successor stage is named, which in this case means a branch has been taken and the instruction is to be flushed, the pipeline thread corresponding to that instruction is simply terminated instead of stalling until a valid successor is available.

PlLabel("fetch1"); { bvec next_pc, pc(Reg(next_pc)); Cassign(next_pc). IF(!PlStall() && !take_branch, pc + Lit(1)). IF(!PlStall() && take_branch, branch_addr). ELSE(pc); iaddr.set(pc); PlSpawn(!take_branch); }

PlLabel("fetch2"); { ienc.set(Rom(iaddr.get(), "irom.hex")); PlJmp("reg", !take_branch); PlEnd(); }

6.5.4 Register File and Scoreboard

The reg stage contains both the accesses to the register file and the primary instruction scheduling structure: a scoreboard. This is simply a vector containing a single bit for each register indicating whether that register is ready. In this implementation, a logical 0 corresponds to a register ready to be read and a logical 1 corresponds to a register whose contents will be updated by an instruction still in flight through the pipeline. The scoreboard bit corresponding to an instruction's destination is set when the instruction passes through this stage and cleared when the instruction writes back. The call to PlStall() causes instructions to stall in this stage until the scoreboard bits for all source registers are clear. A software model of this update rule is sketched below.
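The scoreboard's next-state rule, implemented with Decoder() in the hardware below, can be modeled in plain C++ as follows; this is a software sketch for exposition, not CHDL code.

#include <cstdint>

// Bit i of sb is 1 while register i has a write in flight.
uint32_t next_scoreboard(uint32_t sb,
                         bool wb,    unsigned wbreg,   // writeback this cycle
                         bool issue, unsigned destreg) // instruction issued
{
  if (wb)    sb &= ~(UINT32_C(1) << wbreg);   // writeback clears the bit
  if (issue) sb |=  (UINT32_C(1) << destreg); // issue marks dest pending
  return sb;
}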

Register writeback is handled in the register file, despite the fact that the writeback pipeline stage is described later. The values of the ienc and result variables in the writeback pipeline stage are used to update the scoreboard and register file contents. These values are accessed within the reg stage by passing a pipeline stage name argument to the get() member function of plvar. This technique is frequently used: while many structures such as pipelined functional units are completely confined within a single pipeline stage, many others span multiple pipeline stages and must be described using signals from several of them. It is possible to use ordinary CHDL signals to communicate between pipeline stages, but it is often, as in the case of result.get("writeback"), clearer to specify the pipeline stage alongside the variable.

PlLabel("reg"); { // Scoreboard node set_scoreboard(PlValid() && !PlStall()); bvec<32> next_scoreboard, scoreboard(Reg(next_scoreboard));

bvec<5> wbreg = ienc.get("writeback")[range<16,20>()]; node wb(PlValid("writeback") && ienc.get("writeback")[30]);

node rd_wait, rs1_wait, rs2_wait;

119 rd_wait = Mux(ienc.get()[range<16,20>()], scoreboard) && ienc.get()[30]; rs1_wait = Mux(ienc.get()[range<8,12>()], scoreboard) && ienc.get()[29]; rs2_wait = Mux(ienc.get()[range<0,4>()], scoreboard) && ienc.get()[28];

next_scoreboard = (scoreboard & ˜Decoder(wbreg, wb)) | Decoder(ienc.get()[range<16,20>()], ienc.get()[30] && set_scoreboard);

PlStall((rd_wait || rs1_wait || rs2_wait) && !take_branch);

// Register file vec<32, word_t> r; vec<32, bvec::value> > rf; Flatten(rf) = Flatten(r);

for (unsigned i = 0; i < 32; ++i) r[i] = Wreg(wb && wbreg == Lit<5>(i), result.get("writeback")); a.set(Mux(ienc.get()[range<8,12>()], r)); b.set(Mux(ienc.get()[range<0,4>()], r));

PlJmp("dispatch", !take_branch); PlEnd(); }

6.5.5 Dispatch and Branch Resolution

The next pipeline stage performs two distinct tasks. To keep the branch misprediction penalty short, branches are all resolved prior to dispatch to the pipelined functional units. This is possible because floating point arithmetic operations are quite expensive compared to determining whether the value of a floating point number is positive, negative, or zero. The first half of the dispatch stage, therefore, computes a branch destination address and determines whether the instruction at this stage is a taken branch.

PlLabel("dispatch"); { // Decide branches. bvec<32> af(Flatten(a.get())); bvec<8> opcode(ienc.get()[range<24,31>()]); node z(!OrN(af[range<0,30>()])), pos(af[31] && !z), neg(!af[31] && !z); take_branch = (((opcode == Lit<8>(0x21)) && z) || ((opcode == Lit<8>(0x22)) && pos) || ((opcode == Lit<8>(0x23)) && neg) || (opcode == Lit<8>(0x04))) && PlValid() && !PlStall(); branch_addr = iaddr.get()+Sext(ienc.get()[range<0,7>()]);

Non-branch instructions are dispatched to the appropriate pipelined functional units. To perform this dispatch operation, a series of calls to PlJmp() is used, selecting a successor stage based on the value of the opcode.

  // Dispatch all other instructions.
  PlJmp("put", opcode == Lit<8>(0x10));
  PlJmp("ldi", opcode == Lit<8>(0x40));
  PlJmp("get", opcode == Lit<8>(0x42));
  PlJmp("add", opcode == Lit<8>(0x71));
  PlJmp("sub", opcode == Lit<8>(0x72));
  PlJmp("mul", opcode == Lit<8>(0x73));
  PlEnd();
}

6.5.6 Pipelined Functional Units

The pipelined functional units themselves rely entirely on the CHDL template library implementations of floating point operations. As such, their instantiations in our instruction set processor are fairly bare, consisting only of labeled pipeline stages followed by a number of dummy stages, over which retiming optimizations distribute the latency of the arithmetic operation performed in the named stage.

PlLabel("add"); { result.set(a.get() + b.get()); for (unsigned i = 0; i < ADD_STAGES; ++i) PlStage(); PlJmp("writeback"); }

PlLabel("sub"); { result.set(a.get() - b.get()); for (unsigned i = 0; i < SUB_STAGES; ++i) PlStage(); PlJmp("writeback"); }

PlLabel("mul"); { result.set(a.get() * b.get()); for (unsigned i = 0; i < MUL_STAGES; ++i) PlStage(); PlJmp("writeback"); }

PlLabel("ldi"); { result.set(IToF<8, 23>(ienc.get()[range<0,15>()])); for (unsigned i = 0; i < LDI_STAGES; ++i) PlStage(); PlJmp("writeback"); }

6.5.7 FIFO Input/Output

A traditional memory hierarchy represents a complex piece of hardware beyond the scope of this demonstration design and as such has been omitted. The memory interface of our example instruction set processor consists of two instructions: get, which receives a word of data from an input FIFO, here represented as a ROM, and put, which places a word onto an output FIFO, here represented as an external, non-stallable FIFO interface.

PlLabel("put"); { node stream_out_valid(PlValid() && !PlStall()); bvec<32> stream_out(Flatten(b.get())); OUTPUT(stream_out_valid); OUTPUT(stream_out); PlEnd(); }

PlLabel("get"); { node inc_sp(PlValid() && !PlStall()); bvec<10> stream_ptr; word_t stream_in; Flatten(stream_in) = Rom<10,32>(stream_ptr, "drom.hex"); result.set(stream_in); stream_ptr = Wreg(inc_sp, stream_ptr + Lit<10>(1)); for (unsigned i = 0; i < GET_STAGES; ++i) PlStage(); PlJmp("writeback"); }

6.5.8 Register Writeback

Finally, the register writeback stage must be declared. Since all of the logic of register writeback is already implemented in the register file, this is a vestigial stage that needs only a label and a call to PlEnd(), which allows instructions to finish without stalling even though no successor stage is available.

Figure 31: Waveform of operation of pipelined instruction set processor example with loop of repeated get and add instructions.

PlLabel("writeback"); { PlEnd(); }

6.5.9 Operation

Figure 31 shows the waveform of operation for a core using the given architecture running the two-instruction loop body "get $r1; add $r0, $r0, $r1;" repeatedly. This is a low-performing instruction stream due to the repetition of dependent instructions and, as can be seen, the core spends more time stalling for register contents to become available than executing instructions.

6.5.10 Applicability to Harmonica and Other SIMT Core Designs

The Harmonica design is inherently pipelined, and much of its code is devoted to the flow of signals through the pipeline. This is what motivated the creation of Cheetah in the first place. The technique used here to produce an example floating point oriented instruction set processor could be expanded to implement a HARP-like instruction set, forming the basis for the successor to the present generation of Harmonica implementations.

CHAPTER VII

CONCLUSIONS

Large digital design projects require high levels of intra-design and inter-design reuse in order to meet designer productivity goals and deadlines. One of the ways this can be achieved is by raising the level of abstraction in the language used to input the design. RTL is still the dominant hardware description paradigm used by digital designers targeting both silicon and FPGA, but there are mature options among high-level synthesis, dataflow, and generator-based paradigms.

Many of the HDLs and utilities introduced to support the diverse set of new hardware description paradigms suffer from a lack of mutual and backward compatibility. Given that one of the principal goals of raising the level of abstraction in hardware description is increasing potential for design reuse, the lack of mutual compatibility between these tools is a sticking point for adoption. In order to validate and synthesize designs containing pieces written using multiple paradigms, it becomes necessary to write shim layers and use the least-common-denominator language as an intermediate representation, usually some form of RTL representation. It has been shown in the literature that it is possible to combine multiple hardware description paradigms into a single mutually-compatible system in the context of generative hardware description languages. This work has demonstrated the extension of this compatibility to incorporate a full range of hardware description paradigms, from gate-level to the high-level synthesis of pipelined digital hardware from procedural descriptions.

7.1 Thesis Contributions Revisited

Modern generative HDLs provide an interface to allow the generation of RTL or another suitably low-level representation from within a general-purpose programming language.

Figure 32: Copy of Figure 2. The components of CHDL described in this dissertation are outlined in dashed lines.

CHDL, the vehicle for this dissertation's research, is a C++ library enabling the creation of generators, producing an in-memory netlist from the execution of a C++ program. The components and interfaces of CHDL and the related libraries discussed in this thesis are illustrated in Figure 32. This in-memory netlist is suitable for simulation, output as synthesizable Verilog, or technology mapping to a gate-level netlist suitable principally for high-level design evaluation. Natively, CHDL exports a generative, structural design paradigm, enabling the construction of hardware modules from lower-level hardware modules connected together with structured signals. Modules of CHDL designs are represented by C++ functions, and instantiations of modules by calls to these functions. This lends itself very well to design reuse through the adoption of C++ templates for most of the standard library functions, allowing the use of common hardware blocks with arbitrary data types. While its native hardware description paradigm is generative, the multi-paradigm capability of CHDL and other generative HDLs was demonstrated with an implementation of the RTL paradigm using a custom templated register data type, rtlreg. The CHDL API, uniquely among modern generative HDLs, supports reading and modifying the generated netlist as it is generated, allowing hardware designs to incorporate novel transformations, including optimizations and evaluation methods.

The base CHDL language is suitable for use as a platform for microarchitecture exploration, as demonstrated by the Harmonica family of data parallel core designs. Harmonica is an implementation of the HARP family of instruction set architectures, a range of single-instruction-multiple-thread instruction sets that expose control flow divergence operations

as explicit instructions and allow the use of SIMT cores as both accelerators and stand-alone cores. The Harmonica architecture has been technology mapped to the FreePDK15 and FreePDK45 standard cell libraries, simulated at the gate level using the CHDL simulation environment, and run on modern FPGAs.

The highest-level description to be found in the Harmonica design is register transfer level, but following the creation of Harmonica, other hardware description paradigms have been implemented using CHDL as well, further illustrating the ability of generative hardware description languages to be expanded to multiple paradigms. An example of one such paradigm is guarded atomic actions, popularized by the Bluespec System Verilog hardware description language. This paradigm extends register transfer level with the rule, a set of assignments with guaranteed atomicity and fairness, and the method, a guarded atomic interface to a module. The use of GAA provides a convenient abstraction over the proliferation of ready and valid signals ordinarily found in complex RTL designs.

The highest level of abstraction provided is found in Cheetah, which takes as input a procedural description of a pipelined design in which pipeline stages representing basic blocks are connected by what are effectively control flow branches. Liveness analysis is performed on the set of pipeline-carried values and pipeline registers are inserted wherever they are needed, effectively allowing the synthesis of hardware for highly-parallel execution of specified algorithms while retaining full compatibility with the library of data types and logic modules available within CHDL.

This level of multi-paradigm capability, while not yet demonstrated to this extent outside of CHDL, could be ported to a variety of other generative HDLs. Candidates for which this would be particularly enticing include Chisel and MyHDL, though the unique netlist introspection capabilities provided by CHDL enable modeling and optimization techniques not yet available in these candidates. The potentially multi-paradigm character of generative HDLs, also known as hardware construction languages, provides an important argument for their widespread adoption and use.

7.2 Future Directions

While the pipeline-oriented language provided by Cheetah is procedural, there remains quite a bit of room for high-level synthesis of both fully pipelined and state machine driven designs using generative HDLs. An area not yet explored that seems very promising for future work is the use of homoiconic languages as hardware description languages. Languages like Java and Python that provide some reflection, i.e. the ability to enumerate the members of data structures, have been useful as HDLs due to their ability to trace signals without explicit declaration of each signal to be traced. It was the lack of reflection support that led to the CHDL ag and un types being used instead of C++ structs for structured data within CHDL.

MyHDL and JHDL both make use of the fact that Python and Java provide some access to an intermediate representation based on the source code. In homoiconic languages, by comparison, the abstract syntax tree itself is available as a data structure in the language's native format. The potential for high-level synthesis is quite enticing: with a homoiconic language as the basis for a generative HDL, it should be possible to introduce complete high-level synthesis of any executable code as a library function. This is one area where a library like CHDL cannot easily follow, as there is no straightforward, portable way within C++ to process code as a data structure from within that code. These capabilities can be approximated with macros and C++ functions, like the IF() provided by the CHDL RTL support, but the gap between the generation language and the hardware-oriented language remains. Implementing a CHDL-like hardware description system in a homoiconic language would provide a path to clearing this final hurdle, dissolving the boundary between generative hardware description languages and high-level synthesis, with the drawback that there are not many very popular homoiconic languages, although homoiconicity could be shoehorned into any language with the right preprocessor.

Despite the promising directions offered by radically different languages or significant extensions to C++ for hardware description, the value of the decision to use C++ for CHDL cannot be overstated. Because C++ is an attractive, or at least adequate, language for implementing both CAD algorithms such as synthesis and simulation and hardware description, CHDL makes it possible to integrate synthesis, logic simulation, placement and routing, timing simulation, and even circuit simulation within the same framework. This may take the form of expanding the existing technology mapping algorithm, creating a separate in-memory format for circuit-level netlists and cell libraries, providing a discrete event simulation kernel, and adding support for physical design automation. An expanded framework library or set of libraries containing CHDL could potentially encapsulate an entire open-source hardware design platform within a widely-available, somewhat familiar, C++ compilation environment. This has great appeal in education, where the adoption of widely-installed, familiar, and accessible tools is perhaps more important than achieving the best performance money can buy.

7.3 Concluding Remarks

It is not clear where the future of hardware description lies. The present state of bifurcation between high-level synthesis and RTL design, with relatively few adopting specialized HDLs or constructing front-end scripts for RTL tools, could continue indefinitely. If there is, however, as there has been in system software, a convergence toward open source tools that starts on the low end and eventually grows to encompass the industry, there may be renewed interest in interoperability, and this may finally push toward the widespread adoption of one or more of the open source generative HDLs. If that day comes, it may be the last time in the era of digital design that a new HDL needs to be adopted, as a generative language written in a modern programming language can support the full spectrum of hardware description paradigms.

APPENDIX A

HARP INSTRUCTION SET

A.1 Introduction

HARP is two things: a multi-year Heterogeneous Architecture Research Project, and an implementation of a specific Heterogeneous Architecture Research Prototype. It is for the latter that the HARP instruction set architectures have been created. This is a space of SIMT (GPU) oriented RISC-like instruction sets with the following properties:

• Full predication

level compatibility

• SIM[DT] parallelism

• Little endianness

• 8-bit byte size

• Customizability

The customizability of the HARP ISAs is illustrated by what is missing from this list of features. The data path width, instruction encodings, and number of registers (general purpose and predicate) are all left up to the implementation. Harptool, the HARP assembler, linker, emulator, and disassembler, is passed information about the ISA through an architecture identifier string, or ArchID. An ArchID uniquely identifies a HARP ISA.

A.2 Architecture Identifier String (ArchID)

The best way to understand the multifaceted parameterizability of the HARP ISAs is to study the architecture identifier strings used to uniquely identify a single HARP instruction set architecture. We'll start by breaking down Harptool's default ArchID, 8w32/32/8/8:

Field Meaning

8 8-byte (64-bit) registers and addresses

w Word-based (64-bit) fixed-width instruction encoding

32 32 general-purpose registers per lane

32 32 predicate registers per lane

8 8 SIMD lanes

8 8 warps (thread groups)

All ArchIDs have a similar format, although the final two fields can be omitted, as object files are still fully compatible even if the dimensions of the core change. A sketch of a decoder for this format appears below.
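As an illustration of the format, the following is a hypothetical C++ sketch of an ArchID decoder; the ArchID struct and parseArchID function are inventions for this sketch and are not part of Harptool.

#include <sstream>
#include <string>

// Hypothetical ArchID decoder: "8w32/32/8/8" -> the fields described above.
struct ArchID {
  unsigned regBytes = 0, gprs = 0, pregs = 0, lanes = 0, warps = 0;
  char encoding = 'w';  // 'w' = word-based, 'b' = byte-based
};

ArchID parseArchID(const std::string &s) {
  std::istringstream in(s);
  ArchID a;
  char sep;
  in >> a.regBytes >> a.encoding >> a.gprs >> sep >> a.pregs;
  if (in >> sep >> a.lanes) in >> sep >> a.warps;  // final two fields optional
  return a;
}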

A.3 HarpTool

The assembler/linker/emulator/disassembler program for HARP is called HarpTool. It is a multiple-function executable, its function selected by the first command line argument. When run with no command line arguments, the HarpTool executable prints a help message explaining the available options.

All of the HARP utilities can take an ArchID as a command line parameter; if none is provided, a default is assumed.

A.3.1 Assembler

The assembler converts assembly files to HOF, the Harp Object Format.

A.3.2 Linker

The linker combines HOF files and produces raw RAM images for use by the emulator.

An intended future use is the conversion of multiple HOF files to statically-linked HOF executables.

A.3.3 Disassembler

The disassembler is used to convert HOF files to equivalent assembly files. One of its intended uses is the conversion of HOF object files between different HARP ISAs, say from

8w32/32 to 8b32/32.

A.3.4 Test Programs

In the harptool/test directory there is a set of test programs. The makefile in this directory assembles, links, and emulates them, placing the output in plain text files.

A.3.4.1 hello.s

The simplest example prints a message and exits.

A.3.4.2 2thread.s

2thread performs a vector addition across two threads.

A.3.4.3 sieve.s

sieve performs the Sieve of Eratosthenes in a single thread and prints the results, including the count of total prime numbers found.

A.4 Instruction Encoding

There are two currently-supported types of instruction encoding, but both share the same basic structure. The opcodes and the types of fields required by each instruction are identical; the encodings differ only in the number of bits available for each type of field and in the way predication is specified.

A.4.1 Argument Classes

Instructions can be broadly categorized by the types of arguments they require. The bit

fields in the instruction encodings depend heavily on this quality.

Argument Class Description Example

AC_NONE No arguments di;
AC_2REG 2 GPRs, 1 in, 1 out neg %r1, %r2;
AC_2IMM 1 immediate in, 1 GPR out ldi %r1, #0xff;
AC_3REG 3 GPRs, 2 in, 1 out add %r1, %r2, %r2;
AC_3PREG 3 pred. regs, 2 in, 1 out andp @p0, @p0, @p1;
AC_3IMM GPR in, imm. in, GPR out andi %r1, %r3, #3;
AC_3REGSRC 3 GPRs in tlbadd %r0, %r1, %r2;
AC_1IMM 1 imm. in jmpi label;
AC_1REG 1 reg. in jmpr %r2;
AC_3IMMSRC 2 GPRs in, 1 imm. in st %r1, %r2, #10;
AC_PREG_REG GPR in, pred. reg. out iszero @p0, %r3;
AC_2PREG 2 pred. regs, 1 in, 1 out notp @p0, @p0;

A.4.2 Opcode/Instruction Class Table

00 "nop" NONE 01 "di" NONE 02 "ei" NONE 03 "tlbadd" 3REGSRC 04 "tlbflush" NONE 05 "neg" 2REG 06 "not" 2REG 07 "and" 3REG 08 "or" 3REG 09 "xor" 3REG 0a "add" 3REG 0b "sub" 3REG 0c "mul" 3REG 0d "div" 3REG 0e "mod" 3REG 0f "shl" 3REG 10 "shr" 3REG 11 "andi" 3IMM 12 "ori" 3IMM 13 "xori" 3IMM 14 "addi" 3IMM 15 "subi" 3IMM 16 "muli" 3IMM 17 "divi" 3IMM 18 "modi" 3IMM 19 "shli" 3IMM 1a "shri" 3IMM 1b "jali" 2IMM 1c "jalr" 2REG 1d "jmpi" 1IMM 1e "jmpr" 1REG 1f "clone" 1REG 20 "jalis" 3IMM 21 "jalrs" 3REG 22 "jmprt" 1REG 23 "ld" 3IMM 24 "st" 3IMMSRC 25 "ldi" 2IMM 26 "rtop" PREG_REG 27 "andp" 3PREG 28 "orp" 3PREG 29 "xorp" 3PREG 2a "notp" 2PREG 2b "isneg" PREG_REG 2c "iszero" PREG_REG 2d "halt" NONE 2e "trap" NONE 2f "jmpru" 1REG

133 30 "skep" 1REG 31 "reti" NONE 32 "tlbrm" 1REG 33 "itof" 2REG 34 "ftoi" 2REG 35 "fadd" 3REG 36 "fsub" 3REG 37 "fmul" 3REG 38 "fdiv" 3REG 39 "fneg" 2REG 3a "wspawn" 3REG 3b "split" NONE 3c "join" NONE 3d "bar"

A.4.3 Word Encoding

Word-based instruction encodings all share the initial fields:

• The most-significant bit is 1 if the instruction is predicated and 0 otherwise.

• The next log2(#pred regs) bits specify the predicate register.

• The next 6 bits are used for the opcode.

After this, the operands of the instruction are ordered corresponding to their ordering in the assembly language, sized according to the following rules:

• Register operands are log2(#GPRs) bits long, or just enough bits to uniquely identify a register.

• Predicate register operands are log2(#pred regs) bits long, or just enough bits to uniquely identify a predicate register.

• Immediate fields are always the last field and occupy the remaining bits of the instruction. All immediate fields are sign extended to the length of a machine word.

One (aligned) word:

Width (bits) Field
1 Predicated (1 if predicated, 0 if not)
log2(#pregs) Predicate register
6 Opcode
log2(#GPRs) Register operand
log2(#pregs) Predicate register operand
Remaining bits Immediate operand

A.4.4 Byte Encoding

In the byte encoding, each field of the instruction (predicate, opcode, operands) occupies a byte, with the exception of immediates, which occupy an unaligned word. All instructions have a predicate byte and an opcode byte. The predicate byte is all ones if the instruction is not predicated; otherwise the predicate byte contains the number of the predicate register used to predicate the instruction. Just like the word-based instruction encoding, registers appear in the same order as in the assembly language, destination first.

byte 0: predicate (predicate register number, or 0xff for no predicate)
byte 1: opcode (according to the opcode table in Section A.4.2)
byte 2 and up: operands (according to the instruction's encoding type, based on opcode)
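For instance (a worked example, not from the original text), the predicated instruction @p0 ? addi %r7, %r1, #1 would encode as a predicate byte of 0x00 (predicate register 0), an opcode byte of 0x14 (addi), operand bytes 0x07 and 0x01 for the destination and source registers, and an unaligned word holding the sign-extended immediate value 1.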

A.5 Assembly Language

The assembly language is fairly easy to pick up from the Harptool examples. It is RISC-like, and written destination register first (in this it differs from Unix assembly syntax). Register names are prefixed with the percent sign (%) and predicate register names with the at symbol (@). Predicated instructions are prefixed with the predicate register name and a question mark:

@p0 ? addi %r7, %r1, #1

A small set of directives is provided to express non-instruction data:

Directive Use

.align 256 Align next symbol to a multiple of 256 bytes.

.word 0x1234 Insert a word with the value 0x1234.

.byte 0xff Insert a byte with the value 0xff.

.def SYM 123 Replace SYM with 123 in immediate operands.

.entry Make the next label the HOF executable entry point.

.global Give the next label global (external) linkage.

.perm rw Set HOF permissions of the next label to read/write.

.string "Str" Create a null terminated string.

A.6 Instruction Set

A.6.1 Trivial Instruction

Instruction Description

nop No operation.

A.6.2 Privileged Instructions

Instruction Description

ei Enable interrupts.

di Disable interrupts.

skep %addr Set kernel entry point.

tlbadd %virt, %phys, %flags Add an entry to the TLB.

tlbrm %virt Remove entry corresponding to virt. address from TLB.

tlbflush Remove all but default entry from TLB.

jmpru %addr Jump indirect and switch to user mode.

reti Return from interrupt.

halt Halt CPU until next interrupt.

The flags register used by tlbadd stores, in its least-significant six bits, in order from most to least significant:

Bit Meaning

kx Kernel can execute.

kw Kernel can write.

kr Kernel can read.

ux User can execute.

uw User can write.

ur User can read.

A.6.3 Memory Loads/Stores

Instruction Description

st %val, %base, #Offset Store.

ld %dest, %base, #Offset Load.

A.6.4 Predicate Manipulation

Instruction Description

andp @dest, @src1, @src2 Logical and.

orp @dest, @src1, @src2 Logical or.

xorp @dest, @src1, @src2 Exclusive or.

notp @dest, @src Complement.

A.6.5 Value Tests

Instruction Description

rtop @dest, %src Set @dest if %src is nonzero.

isneg @dest, %src Set @dest if %src is negative.

iszero @dest, %src Set @dest if %src is zero.

A.6.6 Immediate Integer Arithmetic/Logic

Instruction Description

ldi %dest, #Imm Load immediate.

addi %dest, %src1, #Imm Add immediate.

subi %dest, %src1, #Imm Subtract immediate.

muli %dest, %src1, #Imm Multiply immediate.

divi %dest, %src1, #Imm Divide immediate.

modi %dest, %src1, #Imm Modulus immediate.

shli %dest, %src1, #Imm Shift left immediate.

shri %dest, %src1, #Imm Shift right immediate.

andi %dest, %src1, #Imm And immediate.

ori %dest, %src1, #Imm Or immediate.

xori %dest, %src1, #Imm Xor immediate.

A.6.7 Register Integer Arithmetic/Logic

Instruction Description

add %dest, %src1, %src2 Add.

sub %dest, %src1, %src2 Subtract.

mul %dest, %src1, %src2 Multiply.

div %dest, %src1, %src2 Divide.

mod %dest, %src1, %src2 Modulus.

shl %dest, %src1, %src2 Shift left.

shr %dest, %src1, %src2 Shift right.

and %dest, %src1, %src2 And.

or %dest, %src1, %src2 Or.

xor %dest, %src1, %src2 Xor.

neg %dest, %src1 Two’s complement.

not %dest, %src1 Bitwise complement.

A.6.8 Floating Point Arithmetic

These instructions operate on real numbers in an implementation-determined format, which may be fixed point or floating point.

Instruction Description

itof %dest, %src Signed integer to floating point.

ftoi %dest, %src Floating point to signed integer.

fneg %dest, %src Negate (complement sign bit).

fadd %dest, %src1, %src2 Floating point add.

fsub %dest, %src1, %src2 Floating point subtract.

fmul %dest, %src1, %src2 Floating point multiply.

fdiv %dest, %src1, %src2 Floating point divide.

A.6.9 Control Flow

Instruction Description

jmpi #RelDest Jump to immediate (PC-relative).

jmpr %addr Jump indirect.

jali %link, #RelDest Jump and link immediate.

jalr %link, %reg Jump and link indirect.

A.6.10 SIMD Control

Instruction Description

clone %lane Clone register state into specified lane.

jalis %link, %n, #RelDest Jump and link immediate, spawning N active lanes.

jalrs %link, %n, %dest Jump and link indirect, spawning N active lanes.

jmprt %addr Jump indirect, terminate execution on all but lane 0.

split Control flow diverge.

join Control flow reconverge.

A.6.11 Warp Control

Instruction Description
wspawn %dest, %pc, %src Create new warp, copying %src in current warp to %dest in new warp.

bar %id, %n Barrier of %n warps. Identified by %id.

A.6.12 User/Kernel Interaction

Instruction Description

trap User-generated interrupt.

A.7 Interrupts

The HARP interrupt mechanism is simple. For SIMD lane 0, there is a shadow register file, program counter, and active lane count. When an interrupt occurs, the state of lane zero is saved into these shadow registers, and execution resumes at the kernel entry point. The type of interrupt is specified by the value placed in register 0 at this time, according to the following table:

Number Description

0 Trap (user-generated interrupt)

1 Page fault due to absence from TLB

2 Page fault due to permission violation

3 Invalid/unsupported instruction

4 Divergent branch

5 Numerical domain (divide by zero)

6-7 (reserved for future exceptions)

8 Console input

The first eight interrupt numbers are reserved for internal CPU-generated exceptions, and all of the remaining numbers are free for use by hardware.

A.8 Application Binary Interface

The ABI assumes a set of at least four general purpose registers. The frame pointer is optional and can be stored on the stack itself if necessary. The stack pointer and link register, in that order, are always the two highest-numbered registers. If 8 or more registers are available, the frame pointer may be the register numbered one less than the stack pointer. A worked example follows the list below.

• The lower-numbered half of the registers are caller-saved (temporary).

• The upper-numbered half are therefore callee-saved.

• The callee is responsible for adjusting the stack and frame pointers, if such adjustment

is required.

• The stack grows toward smaller addresses (subtract to push, add to pop).

• Pointer function arguments and numerical arguments that can fit in a single register

are passed through temporary registers, starting with register %r0. If more registers

are required than there are temporary registers available, stack space at addresses less

than the stack pointer is used.

• Record (struct) return values and numerical return values larger than the word size

are always passed on the stack. The caller is responsible for allocating the necessary

space. The stack pointer at the time of call is a pointer to the returned structure. All

other return values are returned in %r0.
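As a worked example (not from the original text), in an 8-register configuration these rules make %r6 the stack pointer and %r7 the link register, with %r5 available as the frame pointer; %r0 through %r3 are the caller-saved temporaries used for argument passing, and %r4 through %r7 are callee-saved.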

A.9 SIMD Operation

The HARP ISAs are inherently SIMD. In addition to designs with a single set of functional units and architectural registers, designs are allowed that replicate these while retaining a single front-end and memory system. This allows multiple threads executing the same stream of instructions to simultaneously occupy multiple "lanes" of the processor. When a predicated control flow instruction occurs without unanimous agreement among predicate registers, a divergent branch has occurred. The current response to this is to trap to the kernel (interrupt number 4).

A.9.1 Instructions for SIMD Operation

The clone, jalis, jalrs, and jmprt instructions form the basis of SIMD context control in the HARP instruction set. Context is created using clone; the waiting threads are spawned using jalrs or jalis, "jump-and-link immediate/register and spawn"; and finally the parallel section returns using jmprt, "jump register and terminate", best thought of as "return and terminate."

There are times when a control flow operation will need to be predicated, going one direction on some lanes and the other direction on other lanes. For this, the HARP instruction set provides the split and join instructions. When a predicated split is first encountered, only the lanes for which the split's predicate is true are allowed to continue. The other lanes are masked out until the corresponding join is encountered. The first time join is reached, control flow returns to the instruction following the corresponding split with the set of masked-out lanes complemented. The second time the same join is reached, control flow falls through and the original lane mask is restored. A hardware stack is maintained to keep track of nested splits; a software model of this mask stack is sketched below.
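The following plain C++ sketch is a hypothetical software model of that hardware stack, intended only to make the two-pass reconvergence behavior concrete; it is not drawn from any HARP implementation.

#include <cstdint>
#include <stack>

// Hypothetical model: bit i of a mask is 1 when lane i is active.
struct DivergenceStack {
  struct Entry { uint64_t orig; uint64_t taken; bool second_pass; };
  std::stack<Entry> st;
  uint64_t active = ~UINT64_C(0);

  void split(uint64_t pred) {            // predicated split
    st.push({active, active & pred, false});
    active &= pred;                      // run the "true" lanes first
  }

  bool join() {                          // returns true once reconverged
    Entry &e = st.top();
    if (!e.second_pass) {                // first join: run the other lanes;
      active = e.orig & ~e.taken;        // control returns to after the split
      e.second_pass = true;
      return false;
    }
    active = e.orig;                     // second join: restore mask, fall through
    st.pop();
    return true;
  }
};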

A.10 Default I/O Devices

The emulator currently only supports a single I/O device, simple console I/O. Writing to the address 0x800...0 (an address with its MSB set and all other bits cleared) causes text to be written to the display. Input on this console interface causes an interrupt (number 8).


VITA

Chad Kersey was born in Valdosta, Georgia and spent his formative years living in Macclenny, Florida, graduating in 2005 from both Baker County High School in Glen St. Mary, Florida and Lake City Community College, now known as Florida Gateway College. He completed his B.S. in Computer Engineering in 2008 at Georgia Tech and remained at Georgia Tech to pursue a Ph.D. When he is not trying to advance the frontiers of digital electronics, Chad can often be found exploring, discussing, and trying to preserve its past.
