Dfiant: a Dataflow Hardware Description Language

Dfiant: a Dataflow Hardware Description Language

DFiant: A Dataflow Hardware Description Language Oron Port1 Yoav Etsion1;2 Electrical Engineering1 Computer Science2 Technion – Israel Institute of Technology [email protected] [email protected] Abstract—Today’s dominant hardware description languages rate is impossible to describe with C++ constructs in Vivado (HDLs), namely Verilog and VHDL, tightly couple design func- HLS. Emerging HLSs, therefore, still fail to deliver a clean tionality with timing requirements and target device constraints. separation between functionality and implementation that can As hardware designs and device architectures became increas- ingly more complex, these dominant HDLs yield verbose and yield portable code, while providing general purpose HDL unportable code. To raise the level of abstraction, several high- constructs. We explore these gaps further in Section III. level synthesis (HLS) tools were introduced, usually based on In this paper, we introduce DFiant, a modern HDL whose software languages such as C++. Unfortunately, designing with goal is to allow designers to express portable hardware de- sequential software language constructs comes with a price; the signs. DFiant continues our previous work [4] to decouple designer loses the ability to control hardware construction and data scheduling, which is crucial in many design use-cases. functionality from timing constraints (in an effort to end the In this paper, we introduce DFiant, a Scala-based HDL that ”tyranny of the clock” [5]). DFiant offers a clean model for uses the dataflow model to decouple functionality from imple- hardware construction based on its core characteristics: (i)a mentation constraints. DFiant’s frontend enables functional bit- clock-agnostic dataflow model that enables implicit parallel accurate dataflow programming, while maintaining a complete data and computation scheduling; and (ii) functional regis- timing-agnostic and device-agnostic code. DFiant bridges the gap between software programming and hardware construction, ter/state constructs, accompanied by an automatic pipelining driving an intuitive functional object oriented code into a high- process, which eliminate all explicit register placements along performance hardware implementation. with their direct clock dependency. DFiant borrows and com- bines constructs and semantics from software, hardware and I. INTRODUCTION dataflow languages. Consequently, the DFiant programming Low-level hardware description languages (HDLs) such model accommodates a middle-ground approach between low- as Verilog and VHDL have been dominating the field- level hardware description and high-level sequential program- programmable gate array (FPGA) and application-specific in- ming. tegrated circuit (ASIC) domains for decades. These languages DFiant is implemented as a Scala library, and relies on burden designers with explicitly clocked constructs that do not Scala’s strong, extensible, and polymorphic type system to distinguish between design functionality and implementation provide its own hardware-focused type system (e.g., bit- constraints (e.g., timing, target device). For example, the accurate dataflow types, input/output port types). The inter- register-transfer language (RTL) constructs of both Verilog and actions between DFiant dataflow types create a dependency VHDL require designers to explicitly state the behavior of graph that can be simulated in the Scala integrated develop- each register, regardless if it is part of the core functionality ment environment (IDE), or compiled to an RTL top design file (e.g., a state-machine state register), an artifact of the timing and a TCL constraints file, followed by a hardware synthesis constraints (e.g, a pipeline register), or an artifact of the process using vendor tools. target interface (e.g., a synchronous protocol cycle delay). These semantics narrow design correctness to specific timing II. CONCURRENCY AND DATA SCHEDULING restrictions, while vendor library component instances couple ABSTRACTIONS the design to a given target device. Evidently, formulating complex portable designs is difficult, if not impossible. Finally, Concurrency and data scheduling abstractions rely heavily these older languages do not support modern programming on language semantics. In this section, we explore semantics features that enhance productivity and correctness such as of three distinctively different languages: C++1, VHDL, and polymorphism and type safety. DFiant. Emerging high-level synthesis (HLSs) tools such as Vi- Consider a function f and its implementations, as detailed vado HLS [1], Bluespec SystemVerilog [2], and Chisel [3] in Table I. Despite similar code appearance, the semantics are attempt to bridge the programmability gap. While HLSs very different, as depicted in Fig. 1. The following subsections tend to incorporate modern programming features, they still qualify these semantics. mix functionality with timing and device constraints, or lack hardware construction and timed synchronization control. For 1C++ is required because reference & variables are not available in C. example, designs must be explicitly pipelined in Chisel or More advanced C++ capabilities are not always synthesizable, thus rarely Bluespec, while a simple task as toggling a led at a given used for hardware description. TABLE I DATA SCHEDULING SEMANTICS EXAMPLE FUNCTION, f :DEFINITION AND IMPLEMENTATIONS Formal Definition Functional Drawing C++ Impl.† VHDL Impl.‡ DFiant Impl. void f(int i, f: process(clk) def f(i : DFSInt[32]) = +5 a &a,&b,&c,&d){ begin { f :(i ) ! (a ; b ; c ; d ) if rising_edge(clk) n n2N n n n n n2N i begin 8a = i + 5 *3 b > k k a=i+5; a <=i+5; val a = i+5 > <bk = ak ∗ 3 b=a 3; b <=a 3; val b = a 3 k ≥ 0 * * * , c = a + b + c aˆ <= a;--cyc delay > k k k :> c=a+ b; c <= aˆ+ b; val c = a+b dk = ik − 1 -1 d d=i-1; d <=i-1; val d = i-1 end; (a,b,c,d) //tuple of } end process; } //four † Some type annotations were removed for brevity. ‡ aˆ represents a clock cycle delay of a. f data scheduling f start f end A. C++ Semantics C++ Semantics i Sequential programming models, such as C++, do not (without 0 HLS a0 Data validity guaranteed have concurrent semantics. Data scheduling order is set by pragmas) b 2 0 outside the code statement order and cannot be pipelined . HLS utilities c scope of f extends these languages with pragma directives that change 0 d0 semantics. We observe the C++ f implementation as follows: t i i i Clock 1) All statements are variable assignments. 0 1 2 VHDL a? a0 a1 a2 2) d is independent of a , b , and c but cannot be sched- Semantics â? â? â0 â1 uled concurrently. Additionally, a cannot be safely read (synchronous process) b? b? b0 b1 until f finishes. Proper pragmas allow dataflow analysis c? c? c? c0 and function inlining to overcome these limitations. d? d0 d1 d2 3) Time between/of the data operations is unconstrained. The t i i code does not restrict the functional requirement and will 0 1 a0 a1 maintain correctness for every hardware synthesis fitting DFiant b b its semantics. Semantics 0 1 c0 c1 d d B. VHDL Semantics 0 1 t The RTL programming model is concurrent. Data schedul- ing is manual and clock-bound, while the order is set by the Fig. 1. f data scheduling semantics in C++, VHDL, and DFiant assignment cycle-time. VHDL process semantics are different for signals and variables: signals are updated when the process ends, while variables are updated instantly. When embedded match the output cycle-latencies, and implement explicit in a signal edge-detection conditional construct, both signals guards. and variables can be interpreted as registers, depending on the 4) The implementation is very fragile and has limited reusabil- context. We observe the VHDL f implementation as follows: ity. Foremost, VHDL process construct alone is not 1) All statements are synchronous signal assignments with reusable and requires an entity-architecture encapsulation an explicit single-clock dependency. Clocked f imposes for structural instantiation. Additionally, f is tightly- time restrictions to f. Although this implementation does coupled to clk timing and logic propagation delay. The not contradict the formal definition of f, its correctness is slightest change in requirements or target device can lead guaranteed solely under these restrictions. to a painful redesign. 2) A latency balancing register added to maintain correctness of the c assignment pipeline3. C. DFiant Semantics 3) Data is scheduled for every clock cycle, thus creating a DFiant has a dataflow programming model. Data scheduling pipeline. Each output signal is valid at a different time. order, or token-flow, is set by the data dependency. Essentially, Invalid outputs may be accessed, since VHDL has no the DFiant semantics schedules all independent dataflow ex- implicit guard semantics. More hardware is required to pressions concurrently, while dependent operations are synthe- sized into a guarded FIFO-styled pipeline. Dataflow branches 2 We only observe language semantics. Out-of-order or multi-processor are implicitly forked and joined. Semantically, unused nodes, executions may still apply. 3We can use VHDL variable to avoid latency balancing, by forming a always consume tokens and are discarded during compilation. combinational circuit. We observe the DFiant f implementation as follows: 1) All expressions are dataflow variable declarations. do not auto-pipeline designs. Moreover, DFiant is an early- 2) Concurrency is implicit. Function f is coded intuitively adopter of new Scala features such as literal types [11] and in a sequential manner, since dataflow dependencies are operations [12], which further improve type safety (e.g., a oblivious to statement order. DFBits[5].bits(Hi,Lo) bit selection is compile-time- 3) Data scheduling is implicitly guarded by data dependen- constrained within the 5-bits vector width confines). cies. For example, a is forked into both b and c Synflow Cx: Synflow developed Cx [13] as a designer- operations, while c joins branches from a and b . It is oriented HDL with new language semantics that better fit impossible to read an invalid result or an old result (without hardware design than the classic C syntax.

View Full Text

Details

  • File Type
    pdf
  • Upload Time
    -
  • Content Languages
    English
  • Upload User
    Anonymous/Not logged-in
  • File Pages
    4 Page
  • File Size
    -

Download

Channel Download Status
Express Download Enable

Copyright

We respect the copyrights and intellectual property rights of all users. All uploaded documents are either original works of the uploader or authorized works of the rightful owners.

  • Not to be reproduced or distributed without explicit permission.
  • Not used for commercial purposes outside of approved use cases.
  • Not used to infringe on the rights of the original creators.
  • If you believe any content infringes your copyright, please contact us immediately.

Support

For help with questions, suggestions, or problems, please contact us