Hardware Synthesis from a Stream-Processing Functional Language
Simon Frankau
St. John's College

A dissertation submitted for the degree of Doctor of Philosophy at the University of Cambridge

Copyright © July 2004 Simon Frankau

Author Publications and Statement of Originality

The research presented in this thesis has also been published in the following papers:

[52] FRANKAU, S., AND MYCROFT, A., AND MOORE, S. Statically-allocated languages for hardware stream processing (extended abstract). In Proceedings of UK ACM SIGDA Workshop on Electronic Design Automation, 2002.

[51] FRANKAU, S., AND MYCROFT, A. Stream Processing Hardware from Functional Language Specifications. In Proceedings of the 36th Annual Hawaii International Conference on System Sciences (HICSS), 2003.

Chapter 2 closely follows a publication [51] co-authored with Dr. Alan Mycroft. The presentation here of that work is entirely mine. Other chapters are my own work. The thesis is not substantially the same as any that I have submitted for any other qualification at any other University.

Signed, Simon Frankau, July 2004.

Abstract

As hardware designs grow exponentially larger, there is an increasing challenge to use transistor budgets effectively. Without higher-level synthesis tools, so much effort may be spent on low-level details that it becomes impractical to efficiently design circuits of the size that can be fabricated. This possibility of a design gap has been documented for some time now.

One solution is the use of domain-specific languages. This thesis covers the use of software-like languages to describe algorithms that are to be implemented in hardware. Hardware engineers can use the tools to improve their productivity and effectiveness in this particular domain. Software engineers can also use this approach to benefit from the parallelism available in modern hardware (such as reconfigurable systems and FPGAs), while retaining the convenience of a software description.
In this thesis a statically-allocated pure functional language, SASL, is introduced. Static allocation makes the language suited to implementation in fixed hardware resources. The I/O model is based on streams (linear lazy lists), and implicit parallelism is used in order to maintain a software-like approach. The thesis contributes constraints which allow the language to be statically-allocated, and synthesis techniques for SASL targeting both basic CSP and a graph-based target that may be compiled to a register-transfer level (RTL) description.

Further chapters examine the optimisation of the language, including the use of lenient evaluation to increase parallelism, the introduction of closures and general lazy evaluation, and the use of non-determinism in the language. The extensions are examined in terms of the restrictions required to ensure static allocation, and the techniques required to synthesise them.

Acknowledgements

I would like to thank my supervisors, Simon Moore and Alan Mycroft, without whose advice and insight this thesis would not have been written. I also gratefully acknowledge Altera for the studentship they generously provided. Thanks to all my fellow students, for making the Computer Laboratory such an enjoyable and interesting place to (attempt to) work. Also, thanks to my family and house-mates, who have been highly supportive.

This thesis is dedicated to the memory of my mother, Patricia.

Contents

1 Introduction and Related Work
  1.1 The Need for High-Level HDLs
    1.1.1 A Brief History of HDLs
    1.1.2 A Comparison to Software Languages
    1.1.3 Modern Hardware Development
    1.1.4 Runtime Reconfigurable Systems
  1.2 The Hardware Description Language Space
    1.2.1 Language Assumptions
    1.2.2 Example Languages
  1.3 The Statically-Allocated Stream Language
    1.3.1 SASL's Niche
    1.3.2 Functional Languages
    1.3.3 Static Allocation
    1.3.4 Static Allocation of Functional Languages
    1.3.5 SASL's I/O Model
    1.3.6 A Comparison to Other Languages
  1.4 Thesis Contributions and Organisation

2 The SASL Language
  2.1 The Motivation: SAFL and SAFL+
    2.1.1 The SAFL Language
    2.1.2 SAFL+: An Attempt to Improve I/O
    2.1.3 Functional I/O
  2.2 Other Related Work
  2.3 A Naïve Stream Processing Language
    2.3.1 The Stream-less Language
    2.3.2 Stream-processing extensions
    2.3.3 Problems raised
  2.4 Restrictions for Static Allocation
    2.4.1 The stratified type system
    2.4.2 Linearity
    2.4.3 Stability
    2.4.4 Static Allocation
    2.4.5 Example Programs
  2.5 SASL Semantics
  2.6 Deforestation
  2.7 A Comparison to SAFL+ and Synchronous Dataflow
    2.7.1 SAFL+
    2.7.2 Lustre
  2.8 Summary

3 Translation to CSP
  3.1 Synthesis Aims
  3.2 Synthesis Outline and Function Interfacing
  3.3 Variable access
    3.3.1 Broadcast variables
    3.3.2 Unicast variables
    3.3.3 Stream Variable Access
  3.4 CSP Synthesis
    3.4.1 Non-stream CSP Synthesis
    3.4.2 Stream CSP Synthesis
  3.5 Summary

4 Dataflow Graph Translation
  4.1 Pipelining SASL
  4.2 Dataflow Graph Generation
    4.2.1 Translation to Linear SASL
    4.2.2 Translation to Dataflow Graph
    4.2.3 Graph Properties
    4.2.4 Node Implementation
    4.2.5 Other Dataflow Architectures
  4.3 The Control/Dataflow Graph
    4.3.1 Removing CONS-enclosed Tail Recursion
    4.3.2 Removing Direct Tail Recursion
    4.3.3 Node Implementation
  4.4 Extracting stream buses
    4.4.1 Stream Buses
    4.4.2 Stream Bus Typing
    4.4.3 Typing Implementation
    4.4.4 Typing Examples
    4.4.5 Representing Stream Buses
    4.4.6 Managing Stream Buses
    4.4.7 Node Implementation
  4.5 Summary

5 Optimisation
  5.1 Static Scheduling
    5.1.1 The Problem
    5.1.2 ASAP and ALAP Scheduling
  5.2 Lenient Evaluation
    5.2.1 Signalling on Lenient Streams: The “Push” Model
    5.2.2 Cancelling Lenient Evaluation
    5.2.3 Basic Lenient Evaluation
    5.2.4 Lenient Evaluation with a Stream Bus Controller
    5.2.5 Changing the Evaluation Model: Lazy Tail Matching
    5.2.6 Rearranging Graphs for Lazy Tail Evaluation
  5.3 Program Transformation
    5.3.1 Enabling Graph Optimisations
    5.3.2 Peep-hole Optimisation
    5.3.3 Flattening Conditionals
    5.3.4 Removing Conditional Nodes
    5.3.5 Unrolling Loops
  5.4 Summary

6 Closures and Statically-Allocated Laziness
  6.1 Higher-order Functions as Macros
    6.1.1 Nested Function Definitions
    6.1.2 Lazily-Evaluated Closures
  6.2 Leniently-evaluated Expressions
  6.3 Statically-Allocated Laziness
  6.4 Summary