High-Level Synthesis of Control and Memory Intensive Applications
Total Page:16
File Type:pdf, Size:1020Kb
High-Level Synthesis of Control and Memory Intensive Applications Peeter Ellervee Stockholm 2000 Thesis submitted to the Royal Institute of Technology in partial fulfillment of the requirements for the degree of Doctor of Technology Ellervee, Peeter High-Level Synthesis of Control and Memory Intensive Applications ISRN KTH/ESD/AVH--2000/1--SE ISSN 1104-8697 TRITA-ESD-2000-01 Copyright © 2000 by Peeter Ellervee Royal Institute of Technology Department of Electronics Electronic System Design Laboratory Electrum 229, Isafjordsgatan 22-26 S-164 40 Kista, Sweden URL: http://www.ele.kth.se/ESD Abstract Recent developments in microelectronic technology and CAD technology allow production of larger and larger integrated circuits in shorter and shorter design times. At the same time, the abstraction level of specifications is getting higher both to cover the increased complexity of systems more efficiently and to make the design process less error prone. Although High-Level Synthesis (HLS) has been successfully used in many cases, it is still not as indispensable today as layout or logic synthesis. Various application specific synthesis strategies have been devel- oped to cope with the problems of general HLS strategies. In this thesis solutions for the following sub-tasks of HLS, targeting control and memory inten- sive applications (CMIST) are presented. An internal representation is an important element of design methodologies and synthesis tools. IRSYD (Internal Representation for System Description) has been developed to meet various requirements for internal representations. It is specifically targeted towards representation of heterogeneous systems with multiple front-end languages and to facilitate the integration of several design and validation tools. The memory access bandwidth is one of the main design bottlenecks in control and data-transfer dominated applications such as protocol processing and multimedia. These applications are of- ten characterized by the use of dynamically allocated data structures. A methodology has been developed to minimize not only the total number of data-transfers by merging memory accesses, but also to minimize the total storage by reducing bit-width wastage. An efficient control-flow based scheduling strategy, segment-based scheduling, has been devel- oped to avoid the typical disadvantages of most of the control-flow based schedulers. The scheduler avoids construction of all control paths by dividing control graph into segments dur- ing the graph traversal. Segmentation drastically reduces the number of possible paths thus speeding up the whole scheduling process. Fast and simple heuristic algorithms have been developed which solve the allocation and bind- ing tasks of functional units and storage elements in a unified manner. The algorithms solve a graph coloring problem by working on a weighted conflict graph. A prototype tool set has been developed to test HL S methodology of CMIST style applications. Some industrially relevant design examples have been synthesized by using the tool set. The effectiveness of the methodologies was validated by comparing the results against synthesis re- sults of commercially available HLS tools and in some cases against manually optimized de- signs. To my parents. Acknowledgements I would like to begin by expressing my gratitude to Professor Hannu Tenhunen for providing me with the opportunity to do research in a very interesting area. I would also like to send huge thanks to my supervisor, Docent Ahmed Hemani, for providing me with the challenging tasks and problems to solve, and for the help and support. Of the other people who work here at Electronic System Design Laboratory I would like to thank Dr. Axel Jantsch for his support and for is invaluable comments about various research topics. Dr. Johnny Öberg for keeping up a constant competition between us, for all the help and, especially, for all the fuzzy discussions. Prof. Anshul Kumar and Prof. Shashi Kumar of Indian Institute of Technology in New Delhi for their extremely valuable collaboration, for the suggestions and discussions, which signifi- cantly helped to improve various parts of this work. Prof. Francky Catthoor and Miguel Miranda from IMEC, Leuven, for their help and cooperation in the memory synthesis area. Dr. Mehran Mokhtari from the high-speed group for all the interesting discussions and for con- vincing me “that transistor can work”. Prof. Andres Keevallik and Prof. Raimund Ubar from Tallinn Technical University for introducing to me the sea of research in the area of digital hard- ware. Dr. Kalle Tammemäe from TTU for the help and cooperation. Special thanks to Hans Berggren, Julio Mercado and Richard Andersson at the system group for keeping all the computers up and running, to Aila Pohja for keeping an eye on things when she was here, and to Lena Beronius for keeping an eye on things around here now. I would also like to thank all colleagues at ESDlab, at the high-speed group, at TTU, and all my friends for their support and friendship. Last but not least I would like to thank my parents for their care, and for the inspiring environ- ment they offered to me and to my brother. Kista, February 2000 Peeter Ellervee i ii Table of Contents 1. Introduction 1 1.1. High-level synthesis 1 1.2. HLS of control dominated applications 4 1.3. HLS sub-tasks targeted in the thesis 6 1.4. Conclusion and outline of the thesis 9 2. Prototype High-Level Synthesis Tool “xTractor” 11 2.1. Introduction 11 2.2. Synthesis flow 13 2.3. Target architecture 15 2.4. Component programs 20 2.5. Synthesis and analysis steps 21 2.6. Conclusion and future work 25 3. IRSYD - Internal Representation for System Description 27 3.1. Introduction 27 3.2. Existing Internal Representations 29 3.3. Features required in a unifying Internal Representation 33 3.4. IRSYD concepts 35 3.5. Execution model 40 3.6. Supporting constructions and concepts 45 3.7. Conclusion 50 4. Pre-packing of Data Fields in Memory Synthesis 51 4.1. Introduction 51 4.2. Related work 52 4.3. Memory organization exploration environment 53 4.4. Pre-packing of basic-groups - problem definition and overall approach 55 4.5. Collecting data dependencies 57 4.6. Compatibility graph construction 62 4.7. Clustering 64 4.8. Results of experiments 67 4.9. Conclusion 70 5. Segment-Based Scheduling 71 5.1. Introduction 71 5.2. Related work 72 5.3. Control flow/data flow representation and overview of scheduling process 74 5.4. Traversing CFG 76 iii 5.5. Scheduling loops 81 5.6. Rules for state marking 84 5.7. Results of experiments 86 5.8. Conclusion 88 6. Unified Allocation and Binding 91 6.1. Introduction 91 6.2. Conflict graph construction 92 6.3. Heuristic algorithms 97 6.4. Results of experiments 103 6.5. Conclusion 104 7. Synthesis Examples 105 7.1. Test case: OAM part of the ATM switch 105 7.2. F4/F5: Applying data field pre-packing 108 7.3. xTractor - synthesis results 118 8. Thesis Summary 123 8.1. Conclusions 123 8.2. Future Work 125 9. References 127 Appendix A. Command line options of component tools 137 Appendix B. IRSYD - syntax in BNF 141 Appendix C. IRSYD - C++ class structure 155 iv List of Publications High-Level Synthesis 1. A. Hemani, B. Svantesson, P. Ellervee, A. Postula, J. Öberg, A. Jantsch, H. Tenhunen, “High-Level Synthesis of Control and Memory Intensive Communications System”. Eighth Annual IEEE International ASIC Conference and Exhibit (ASIC’95), pp.185-191, Austin, USA, Sept. 1995. 2. B. Svantesson, A. Hemani, P. Ellervee, A. Postula, J. Öberg, A. Jantsch, H. Tenhunen, “Modelling and Synthesis of Operational and Management System (OAM) of ATM Switch Fabrics”. The 13th NORCHIP Conference, pp.115-122, Copenhagen, Denmark, Nov. 1995. 3. B. Svantesson, P. Ellervee, A. Postula, J. Öberg, A. Hemani, “A Novel Allocation Strategy for Control and Memory Intensive Telecommunication Circuits”. The 9th International Conference on VLSI Design, pp.23-28, Bangalore, India, Jan. 1996. 4. J. Öberg, J. Isoaho, P. Ellervee, A. Jantsch, A. Hemani, “A Rule-Based Allocator for Im- proving Allocation of Filter Structures in HLS”. The 9th International Conference on VLSI Design, pp.133-139, Bangalore, India, Jan. 1996. 5. P. Ellervee, A. Hemani, A. Kumar, B. Svantesson, J. Öberg, H. Tenhunen, “Controller Syn- thesis in Control and Memory Centric High Level Synthesis System”. The 5th Biennial Baltic Electronic Conference, pp.397-400, Tallinn, Estonia, Oct. 1996. 6. P. Ellervee, A. Kumar, B. Svantesson, A. Hemani, “Internal Representation and Behav- ioural Synthesis of Control Dominated Applications”. The 14th NORCHIP Conference, pp.142-149, Helsinki, Finland, Nov. 1996. 7. P. Ellervee, A. Kumar, B. Svantesson, A. Hemani, “Segment-Based Scheduling of Control Dominated Applications in High Level Synthesis”. International Workshop on Logic and Architecture Synthesis, pp.337-344, Grenoble, France, Dec. 1996. 8. J. Öberg, P. Ellervee, A. Kumar, A. Hemani, “Comparing Conventional HLS with Gram- mar-Based Hardware Synthesis: A Case Study”. The 15th NORCHIP Conference, pp.52- 59, Nov. 1997, Tallinn, Estonia. 9. P. Ellervee, S. Kumar, A. Hemani, “Comparison of Four Heuristic Algorithms for Unified v Allocation and Binding in High-Level Synthesis”. The 15th NORCHIP Conference, pp.60- 66, Tallinn, Estonia, Nov. 1997. 10. P. Ellervee, S. Kumar, A. Jantsch, A. Hemani, B. Svantesson, J. Öberg, I. Sander, “IRSYD - An Internal representation for System Description (Version 0.1)”. TRITA-ESD-1997-10, Royal Institute of Technology, Stockholm, Sweden. 11. A. Jantsch, S. Kumar, I. Sander, B. Svantesson, J. Öberg, A. Hemani, P. Ellervee, M. O’Nils, “Comparison of Six Languages for System Level Descriptions of Telecom Sys- tems”. First International Forum on Design Languages (FDL’98), vol.2, pp.139-148, Lau- sanne, Switzerland, Sept. 1998. 12. P. Ellervee, S. Kumar, A. Jantsch, B. Svantesson, T.