
SESSION
PERFORMANCE ISSUES AND ENHANCEMENT METHODS + LOW POWER COMPUTING AND ANALYSIS

Chair(s)

TBA

Reducing Energy Usage of NULL Convention Logic Circuits using NULL Cycle Reduction Combined with Supply Voltage Scaling

Brett Sparkman and Scott C. Smith
Department of Electrical Engineering, University of Arkansas, Fayetteville, AR, U.S.A.
[email protected] and [email protected]

Abstract - The NULL Cycle Reduction (NCR) technique can be used to improve the performance of a NULL Convention Logic (NCL) circuit at the expense of power and area. However, by decreasing the supply voltage, the power of the NCR circuit can be reduced. Since NCR has increased performance, it should be possible to reduce supply voltage to decrease overall power while maintaining the original performance of the circuit. To verify this, the NCR circuit was implemented using a 4-bit by 4-bit dual-rail multiplier as the test circuit, which was simulated in ModelSim to ensure functionality, synthesized into a netlist using Leonardo, and imported into Cadence to perform transistor-level simulation for power calculations. The supply voltage of the duplicate NCR circuits was decreased until the performance matched the design of the original multiplier, resulting in approximately 25% overall lower energy usage.

Keywords: asynchronous circuits; NULL Convention Logic (NCL); NULL Cycle Reduction (NCR); supply voltage scaling

I. INTRODUCTION

As circuits are continually produced with increasing numbers of transistors and switching frequencies, circuit power also increases. Although these improvements drastically raise circuit performance, they also have the downside of consuming larger amounts of power, which causes circuits to heat up and last a shorter amount of time on a single battery charge.

This paper demonstrates that overall circuit energy can be reduced for NULL Convention Logic (NCL) circuits [4], while maintaining equivalent performance, by applying the NULL Cycle Reduction (NCR) technique [8] and reducing the supply voltage of the duplicate circuits.

II. PREVIOUS WORK

A. Introduction to NULL Convention Logic

NCL offers a self-timed logic paradigm where control is inherent with each datum. NCL follows the so-called "weak conditions" of Seitz's delay-insensitive signaling scheme [1]. As with other self-timed logic methods, the NCL paradigm assumes that forks in wires are isochronic [2]. The origins of various aspects of the paradigm, including the NULL (or spacer) logic state from which NCL derives its name, can be traced back to Muller's work on speed-independent circuits in the 1950s and 1960s [3].

NCL uses symbolic completeness of expression [4] to achieve delay-insensitive behavior. A symbolically complete expression is defined as an expression that only depends on the relationships of the symbols present in the expression, without a reference to their time of evaluation. In particular, dual-rail signals or other Mutually Exclusive Assertion Groups (MEAGs) can be used to incorporate data and control information into one mixed signal path to eliminate time reference [5]. A dual-rail signal, D, consists of two wires, D0 and D1, which may assume any value from the set {DATA0, DATA1, NULL}. The DATA0 state (D0 = 1, D1 = 0) corresponds to a Boolean logic 0, the DATA1 state (D0 = 0, D1 = 1) corresponds to a Boolean logic 1, and the NULL state (D0 = 0, D1 = 0) corresponds to the empty set, meaning that the value of D is not yet available. The two rails are mutually exclusive, so that both rails can never be asserted simultaneously; this state is defined as an illegal state. Dual-rail signals are space-optimal 1-out-of-N delay-insensitive codes, requiring two wires per bit.

Most multi-rail delay-insensitive systems [1,4,6], including NCL, have at least two register stages, one at the input and one at the output. Two adjacent register stages interact through their request and acknowledge lines, Ki and Ko, respectively, to prevent the current DATA wavefront from overwriting the previous DATA wavefront, by ensuring that the two DATA wavefronts are always separated by a NULL wavefront.

NCL differs from the other delay-insensitive paradigms [1,6] in that these other paradigms only utilize one type of state-holding gate, the C-element [3]. A C-element behaves as follows: when all inputs assume the same value, the output assumes this value; otherwise the output does not change. On the other hand, all NCL gates are state-holding. Thus, NCL optimization methods can be considered as a subclass of the techniques for developing delay-insensitive circuits using a pre-defined set of more complex components, with built-in hysteresis behavior.
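To make the dual-rail encoding concrete, the following is a minimal behavioral sketch in C (our illustration, not code from the paper; the names are invented):

#include <stdio.h>

/* Dual-rail signal: D0/D1 encode DATA0, DATA1, or NULL.
 * Asserting both rails at once is the illegal state. */
struct dual_rail { int d0, d1; };

const char *decode(struct dual_rail d)
{
    if (d.d0 && d.d1) return "ILLEGAL";  /* rails must be mutually exclusive */
    if (d.d0)         return "DATA0";    /* Boolean 0 */
    if (d.d1)         return "DATA1";    /* Boolean 1 */
    return "NULL";                       /* value not yet available */
}

int main(void)
{
    struct dual_rail d = { 0, 1 };
    printf("%s\n", decode(d));           /* prints DATA1 */
    return 0;
}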

NCL uses threshold gates for its basic logic elements [7]. The primary type of threshold gate is the THmn gate, where 1 ≤ m ≤ n, as depicted in Fig. 1. THmn gates have n inputs, and at least m of the n inputs must be asserted before the output will become asserted. Because NCL threshold gates are designed with hysteresis, all asserted inputs must be de-asserted before the output will be de-asserted. Hysteresis ensures a complete transition of inputs back to NULL before asserting the output associated with the next wavefront of input data. Therefore, a THnn gate is equivalent to an n-input C-element and a TH1n gate is equivalent to an n-input OR gate. In a THmn gate, each of the n inputs is connected to the rounded portion of the gate; the output emanates from the pointed end of the gate; and the gate's threshold value, m, is written inside of the gate. NCL threshold gates may also include a reset input to initialize the output. Resettable gates are denoted by either a D or an N appearing inside the gate, along with the gate's threshold, referring to the gate being reset to logic 1 or logic 0, respectively.

Figure 1. THmn threshold gate [8].

By employing threshold gates for each logic rail, NCL is able to determine the output status without referencing time. Inputs are partitioned into two separate wavefronts, the NULL wavefront and the DATA wavefront. The NULL wavefront consists of all inputs to a circuit being NULL, while the DATA wavefront refers to all inputs being DATA, some combination of DATA0 and DATA1 for dual-rail inputs. Initially, all circuit elements are reset to the NULL state. First, a DATA wavefront is presented to the circuit. Once all of the outputs of the circuit transition to DATA, the NULL wavefront is presented to the circuit. After all of the outputs of the circuit transition to NULL, the next DATA wavefront is presented to the circuit. This DATA/NULL cycle continues repeatedly. As soon as all outputs of the circuit are DATA, the circuit's result is valid. The NULL wavefront then transitions all of these DATA outputs back to NULL. When the outputs transition back to DATA again, the next output is available. This period is referred to as the DATA-to-DATA cycle time, denoted as TDD, and has an analogous role to the clock period in a synchronous system.

The completeness of input criterion [4], which NCL combinational circuits and circuits developed from other delay-insensitive paradigms [1,6] must maintain in order to be delay-insensitive, requires the following: 1) all the outputs of a combinational circuit may not transition from NULL to DATA until all inputs have transitioned from NULL to DATA; and 2) all the outputs of a combinational circuit may not transition from DATA to NULL until all inputs have transitioned from DATA to NULL. In circuits with multiple outputs, it is acceptable, according to Seitz's weak conditions [1], for some of the outputs to transition without having a complete input set present, as long as all outputs cannot transition before all inputs arrive.

Furthermore, circuits must also adhere to the completion-completeness criterion [9], which requires that completion signals only be generated such that no two adjacent DATA wavefronts can interact within any combinational component. This condition is only necessary when the bit-wise completion strategy is used with selective input-incomplete components, since it is inherent when using the full-word completion strategy and when using the bit-wise completion strategy with no input-incomplete components [9].

One more condition must be met to ensure delay-insensitivity for NCL and other delay-insensitive circuits [1,6]: no orphans may propagate through a gate [10]. An orphan is defined as a wire that transitions during the current DATA wavefront but is not used in the determination of the output. Orphans are caused by wire forks and can be neglected through the isochronic fork assumption [2], as long as they are not allowed to cross a gate boundary. This observability condition, also referred to as indicatability or stability, ensures that every gate transition is observable at the output, which means that every gate that transitions is necessary to transition at least one of the outputs.

B. Introduction to NULL Cycle Reduction

The NCR technique for reducing the NULL cycle, thus increasing throughput for any delay-insensitive circuit developed according to the paradigms [1,4,6], is shown in Fig. 2. The NCR architecture in Fig. 2 is specifically designed for dual-rail circuits utilizing full-word completion, where all bits at the output of a registration stage are conjoined to form one completion signal.

Figure 2. NCR architecture.
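A behavioral sketch of the two ingredients just described, a THmn gate with hysteresis and a full-word completion detector, might look as follows in C (our illustration only; all function and type names are invented):

/* Behavioral model of a THmn threshold gate with hysteresis:
 * the output asserts once at least m of the n inputs are asserted,
 * and deasserts only after all n inputs return to 0 (NULL). */
struct th_gate { int m, n, out; };

int th_eval(struct th_gate *g, const int *in)
{
    int asserted = 0;
    for (int i = 0; i < g->n; i++)
        asserted += in[i] ? 1 : 0;
    if (asserted >= g->m)
        g->out = 1;
    else if (asserted == 0)
        g->out = 0;
    /* otherwise the previous output is held (hysteresis) */
    return g->out;
}

/* Full-word completion: conjoin per-bit completion so Ko requests
 * NULL (0) only after every dual-rail bit is DATA, and requests
 * DATA (1) only after every bit is NULL. rail0/rail1 hold the two
 * rails for each of the `bits` outputs of a registration stage. */
int full_word_ko(const int *rail0, const int *rail1, int bits, int prev_ko)
{
    int data = 0, null = 0;
    for (int i = 0; i < bits; i++) {
        if (rail0[i] ^ rail1[i]) data++;
        else if (!rail0[i] && !rail1[i]) null++;
    }
    if (data == bits) return 0;
    if (null == bits) return 1;
    return prev_ko;   /* wavefront incomplete: hold the request */
}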

Circuit #1 and Circuit #2 are both dual-rail delay-insensitive combinational circuits utilizing full-word completion, developed from one of the delay-insensitive paradigms [1,4,6], with at least an input and output registration stage. Additional registration stages may be present, thus further partitioning the combinational circuitry. Both circuits have identical functionality, and both are initialized to output NULL and request DATA upon reset. In the case of the NCL paradigm, the combinational functionality can be designed using the Threshold Combinational Reduction method described in [11]; and the resulting circuit can also be pipelined, as described in [12], to further increase throughput. The Demultiplexer partitions the input, D, into two outputs, A and B, such that A receives the first DATA/NULL cycle and B receives the second DATA/NULL cycle. The input continuously alternates between A and B. The Completion Detection circuitry detects when either a complete DATA or NULL wavefront has propagated through the Demultiplexer and requests the next NULL or DATA wavefront, respectively. Sequencer #1 is controlled by the output of the Completion Detection circuitry and is used to select either output A or B of the Demultiplexer. Output A of the Demultiplexer is input to Circuit #1, when requested by Ki1; and output B of the Demultiplexer is input to Circuit #2, when requested by Ki2. The outputs of Circuit #1 and Circuit #2 are allowed to pass through their respective output registers, as determined by Sequencer #2, which is controlled by the external request, Ki. The Multiplexer rejoins the partitioned datapath by passing a DATA input on either A or B to the output, or asserting NULL on the output when both A and B are NULL. Figure 2 shows the state of the system when a DATA wavefront is being input before its acknowledge flows through the Completion Detection circuitry, and when a DATA wavefront is being output before it is acknowledged by the receiver.

III. SIMULATION RESULTS

In order to determine whether reducing the supply voltage of an NCR architecture can reduce its power while maintaining performance, a series of simulations was performed. First, a simulation of the VHDL design was performed in ModelSim to ensure that the circuit performed as desired. Next, the files were synthesized using Leonardo in order to generate a Verilog netlist, which was then imported into Cadence. The final steps involved running numerous transistor-level simulations in Cadence using the Analog Design Environment with UltraSim set as the simulator. The results were used to determine the effects of reducing the supply voltage in terms of power and performance.

The parameter names assigned to the multiple supply voltages were as follows: Vglobal for the Demultiplexer, Sequencer #1, and Completion Detection circuitry; Vlocal for Circuit #1 and Circuit #2; Vmux for the Multiplexer; and Vsel for Sequencer #2. For each simulation using the NCR design, Vlocal, Vmux, and Vsel were reduced in certain sets. The first voltage reduced was Vlocal, because Circuit #1 and Circuit #2 were larger than the Multiplexer and Sequencer #2; reducing only Vlocal would reduce overall power most effectively. The next voltage reduced was Vmux, because the Multiplexer was larger than Sequencer #2. The last voltage reduced was Vsel, because the output select was the smallest of the three components to which the reduced voltage could be applied. These were reduced until the period and power of the NCR design were lower than those of the single-multiplier design, if possible, with a smallest resolution of 10mV.

On the simulation plots, it was noted that the outputs took a short amount of time before they began appearing. This delay occurred because the pipeline took a small amount of time to fill before the correct output could be observed. In order to calculate the period of the circuit, the period of the main Ko was averaged between the 10th rising edge and the 20th rising edge. Taking the average in this manner ensured that the circuit had reached a steady state. Similarly, the currents of all the voltage supplies were integrated to determine the energy used by the circuit. From this data, the energy per operation was calculated.

A. One-Multiplier Design

Initially, a single 4-bit by 4-bit multiplier, shown in Fig. 3, was chosen as the main circuit. To implement the NCR architecture shown in Fig. 2, Circuit #1 and Circuit #2 each consisted of this 4-bit by 4-bit multiplier. The original multiplier design was simulated, and the results are shown in Table I.

TABLE I. ONE-MULTIPLIER DESIGN RESULTS.

Design           Vglobal (V)  Iglobal (µA)  Vlocal (V)  Ilocal (µA)  Vmux (V)  Imux (µA)  Vsel (V)  Isel (µA)  Period (ns)  Energy/Op (µJ)
One-Multiplier   1.20         121.5         -           -            -         -          -         -          4.18         6.10
NCR Design       1.20         52.38         1.20        153.1        1.20      5.14       1.20      7.19       3.33         8.71
                 1.20         45.60         1.10        122.4        1.20      4.62       1.20      6.33       3.76         7.61
                 1.20         45.39         1.10        121.6        1.10      4.11       1.20      6.21       3.78         7.56
                 1.20         45.48         1.10        120.4        1.10      4.08       1.10      5.48       3.80         7.51
                 1.20         39.49         1.00        95.0         1.20      4.30       1.20      5.46       4.36         6.72
                 1.20         39.52         1.00        94.7         1.00      3.18       1.20      5.39       4.39         6.67
                 1.20         39.26         1.00        92.8         1.00      3.13       1.00      4.23       4.46         6.56
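The post-processing described in Section III can be sketched in C as follows (our illustration; the function names and data layout are invented, not taken from the authors' flow):

/* Average Ko period between the 10th and 20th rising edges.
 * edge_times[] holds rising-edge timestamps in ns, index 0 = 1st edge. */
double avg_period_ns(const double *edge_times)
{
    return (edge_times[19] - edge_times[9]) / 10.0;  /* 10 full cycles */
}

/* Energy drawn from one supply: trapezoidal integration of I(t)
 * over the measurement window, multiplied by the supply voltage. */
double supply_energy_joules(const double *t_s, const double *i_a,
                            int samples, double volts)
{
    double charge = 0.0;
    for (int k = 1; k < samples; k++)
        charge += 0.5 * (i_a[k] + i_a[k - 1]) * (t_s[k] - t_s[k - 1]);
    return charge * volts;
}

Summing supply_energy_joules over all four supplies and dividing by the number of operations in the window gives the energy per operation reported in the tables.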

Figure 3. Single 4-bit by 4-bit multiplier.

Figure 4. Two-multiplier design.

Unfortunately, there was no possibility of reducing the supply voltages of the NCR design so that the period and power would both be less than those of the single-multiplier circuit. The reduction of power with a comparable delay was closest when Vlocal was reduced to 1.00V and everything else remained at 1.2V. The period and power could not both be reduced below those of the single multiplier because Circuit #1 and Circuit #2, the multiplier copies, were fairly small; hence, the overhead of the added DEMUX, MUX, and Sequencers outweighed the power savings. If the duplicated circuit were larger, a larger power reduction would be seen compared to the period increase, potentially allowing the circuit to be both lower power and faster.

B. Two-Multiplier Design

Producing a circuit with both lower power and a lower period would require enlarging the duplicated circuit, so that reducing the supply voltage would lessen power by a larger amount, needed to compensate for the overhead of the MUX, DEMUX, and Sequencers. Two additional circuits were designed and simulated to test this hypothesis. Although these circuits still used the same multiplier, more copies of the multiplier were used to generate the larger circuit.

The first additional circuit designed and simulated was two of the multipliers in series, as shown in Fig. 4. The new circuit no longer performed the same function as the original circuit, but it served as a simple example of a larger circuit. The two-multiplier circuit would now take the place of Circuit #1 and Circuit #2 in the NCR architecture. A similar series of simulations was performed.

Using the two-multiplier NCR design, it was possible to achieve lower power while operating slightly faster. The supply voltage parameter settings that accomplished this are marked with an asterisk in Table II. Using a 1.04V Vlocal and Vmux, while maintaining the 1.2V supply on all other circuit elements, decreased the period by 0.06ns and the energy/operation by 0.57µJ. These decreases correspond to a performance increase of approximately 1.4% and an energy/operation decrease of approximately 4.7%. Although these results were positive, they were not as beneficial as desired, so another even larger circuit was designed.

C. Four-Multiplier Design

To further demonstrate the advantages of the NCR power reduction technique, an even larger third circuit was designed. This circuit strung together four of the multipliers, as shown in Fig. 5.

Simulating the four-multiplier NCR design further showed the benefits of reducing supply voltages in terms of power. It was possible for the NCR design to consume far less power while maintaining performance. The supply voltage parameter settings that accomplished this are marked with an asterisk in Table III. Using a 1.0V Vlocal and Vmux while maintaining the 1.2V supply on all other circuit elements decreased the period by 0.01ns and the energy/operation by 6.02µJ. These decreases correspond to a performance increase of approximately 0.2% and an energy/operation decrease of approximately 24.8%. Savings such as this could greatly benefit situations where circuits require lower power to operate. It was observed that the lower MUX supply voltage produced the same lower voltage at the output compared to the original design.
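As a quick arithmetic cross-check of the four-multiplier percentages quoted above (a worked example of ours, using the Table III baseline row and the asterisked NCR row):

#include <stdio.h>

int main(void)
{
    /* Four-multiplier baseline and selected NCR setting from Table III. */
    double base_period = 4.25, base_energy = 24.25;
    double ncr_period  = 4.24, ncr_energy  = 18.23;   /* Vlocal = Vmux = 1.0V */

    printf("performance gain: %.1f%%\n",
           100.0 * (base_period - ncr_period) / base_period);  /* ~0.2% */
    printf("energy/op saving: %.1f%%\n",
           100.0 * (base_energy - ncr_energy) / base_energy);  /* ~24.8% */
    return 0;
}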


TABLE II. TWO-MULTIPLIER DESIGN RESULTS.

Design           Vglobal (V)  Iglobal (µA)  Vlocal (V)  Ilocal (µA)  Vmux (V)  Imux (µA)  Vsel (V)  Isel (µA)  Period (ns)  Energy/Op (µJ)
Two-Multiplier   1.20         241.0         -           -            -         -          -         -          4.22         12.21
NCR Design       1.20         51.20         1.20        303.0        1.20      5.11       1.20      6.69       3.39         14.87
                 1.20         45.44         1.10        242.4        1.20      4.58       1.20      6.03       3.83         12.77
                 1.20         45.59         1.10        241.7        1.10      4.06       1.20      6.03       3.84         12.75
                 1.20         45.38         1.10        238.9        1.10      4.01       1.10      5.23       3.86         12.65
                 1.20         41.47         1.05        212.4        1.05      3.50       1.05      4.54       4.16         11.70
               * 1.20         41.22         1.04        210.0        1.20      4.31       1.20      5.42       4.16         11.64
                 1.20         41.20         1.04        208.2        1.04      3.44       1.20      5.44       4.19         11.56
                 1.20         41.04         1.04        205.3        1.04      3.43       1.04      4.41       4.23         11.46
                 1.20         41.26         1.03        203.1        1.20      4.27       1.20      5.36       4.23         11.43
                 1.20         41.03         1.03        202.6        1.03      3.33       1.20      5.29       4.25         11.39
                 1.20         39.80         1.03        200.9        1.03      3.33       1.03      4.37       4.30         11.30
                 1.20         38.85         1.00        187.1        1.20      4.23       1.20      5.16       4.43         10.86
                 1.20         39.10         1.00        185.2        1.00      3.11       1.20      5.15       4.46         10.77
                 1.20         38.91         1.00        181.7        1.00      3.07       1.00      3.89       4.52         10.65
* settings referenced in the text

Figure 5. Four-multiplier design.

TABLE III. FOUR-MULTIPLIER DESIGN RESULTS.

Design           Vglobal (V)  Iglobal (µA)  Vlocal (V)  Ilocal (µA)  Vmux (V)  Imux (µA)  Vsel (V)  Isel (µA)  Period (ns)  Energy/Op (µJ)
Four-Multiplier  1.20         475.1         -           -            -         -          -         -          4.25         24.25
NCR Design       1.20         51.65         1.20        606.4        1.20      5.17       1.20      6.69       3.01         24.23
                 1.20         46.12         1.10        485.5        1.20      4.63       1.20      6.04       3.60         21.69
                 1.20         45.76         1.10        483.8        1.10      4.12       1.20      6.03       3.61         21.65
                 1.20         45.52         1.10        479.8        1.10      4.10       1.10      5.27       3.64         21.60
                 1.20         39.80         1.01        378.1        1.01      3.18       1.01      4.10       4.24         18.51
                 1.20         40.19         1.00        374.8        1.20      4.26       1.20      5.16       4.21         18.29
               * 1.20         39.48         1.00        372.8        1.00      3.13       1.20      5.17       4.24         18.23
                 1.20         38.94         1.00        366.0        1.00      3.11       1.00      3.93       4.31         18.11
                 1.20         39.32         0.99        363.7        1.20      4.32       1.20      5.05       4.29         17.94
                 1.20         38.99         0.99        360.9        0.99      3.05       1.20      4.95       4.32         17.85
* settings referenced in the text


IV. CONCLUSIONS

Although it was impossible to reduce the power and maintain the performance of the initial one-multiplier NCR design, it was possible to greatly reduce the power while maintaining performance for larger circuits by scaling the supply voltage. By stringing together two-multiplier and four-multiplier NCR designs and performing transistor-level simulations in Cadence to calculate power, it was seen that the power reduction significantly increases as the duplicated circuit size increases. The decrease in power consumption occurred because the lower supply voltage was distributed over a larger portion of the entire NCR design. The preferred design has a reduced supply voltage connected to only the duplicated circuit; this connection will ensure that the outputs are at the nominal supply voltage level and are therefore equivalent to the original design.

The technique of applying the NCR architecture to a circuit and then reducing the supply voltage to the duplicated circuits could be extremely useful in reducing the power of large circuits. As circuit size increases, the benefits of this technique increase rapidly. The supply voltage levels can be fine-tuned to produce a circuit with the exact same performance as the individual circuit with far less power usage.

REFERENCES

[1] C.L. Seitz, "System timing," in Introduction to VLSI Systems, Addison-Wesley, 1980, pp. 218-262.
[2] A.J. Martin, "Programming in VLSI," in Development in Concurrency and Communication, Addison-Wesley, 1990, pp. 1-64.
[3] D.E. Muller, "Asynchronous logics and application to information processing," in Switching Theory in Space Technology, Stanford University Press, 1963, pp. 289-297.
[4] K.M. Fant and S.A. Brandt, "NULL convention logic: a complete and consistent logic for asynchronous digital circuit synthesis," in International Conference on Application Specific Systems, Architectures, and Processors, 1996, pp. 261-273.
[5] T. Verhoeff, "Delay-insensitive codes—an overview," Distributed Computing, vol. 3, pp. 1-8, 1988.
[6] I. David, R. Ginosar, and M. Yoeli, "An efficient implementation of boolean functions as self-timed circuits," IEEE Transactions on Computers, vol. 41, no. 1, pp. 2-10, 1992.
[7] G.E. Sobelman and K.M. Fant, "CMOS circuit design of threshold gates with hysteresis," in IEEE International Symposium on Circuits and Systems, vol. II, 1998, pp. 61-65.
[8] S.C. Smith, "Speedup of NULL convention digital circuits using NULL cycle reduction," Journal of Systems Architecture, vol. 52, pp. 411-422, 2006.
[9] S.C. Smith, "Completion-completeness for NULL convention digital circuits utilizing the bit-wise completion strategy," in The 2003 International Conference on VLSI, 2003, pp. 143-149.
[10] A. Kondratyev, L. Neukom, O. Roig, A. Taubin, and K. Fant, "Checking delay-insensitivity: 10^4 gates and beyond," in Eighth International Symposium on Asynchronous Circuits and Systems, 2002, pp. 137-145.
[11] S.C. Smith, R.F. DeMara, J.S. Yuan, D. Ferguson, and D. Lamb, "Optimization of NULL convention self-timed circuits," Integration, the VLSI Journal, vol. 37, no. 3, pp. 135-165, 2004.
[12] S.C. Smith, R.F. DeMara, M. Hagedorn, and D. Ferguson, "Delay-insensitive gate-level pipelining," Integration, the VLSI Journal, vol. 30, no. 2, pp. 103-131, 2001.


New Single-Phase Adiabatic Logic Family

Mihail Cutitaru and Lee A. Belfore II
Department of Electrical and Computer Engineering, Old Dominion University, Norfolk, VA, 23529 USA

Abstract— Power dissipation minimization is the core principle in making any electronic product portable. Even though there has been a decrease in circuit operating voltages, significant power is lost in switching elements (transistors). This has given rise to a new way of computing – adiabatic computing, where significant energy savings are achieved by using time-varying clocks. This paper gives an explanation of how adiabatic computing works and presents several adiabatic families and their advantages and disadvantages. We also propose a new single-phase adiabatic family and show that the proposed buffer/inverter uses 68% less energy than its CMOS equivalent.

Keywords: Adiabatic logic, low-power computing

1. Introduction

Minimizing power dissipation has been the focus of circuit design for many years, but it has become more important with more new portable devices emerging on the market. As devices get smaller and circuit densities grow, the problem of power dissipation minimization becomes a real concern. Power dissipation is related to the switching activity in transistors and is a function of the square of the input voltage (Vdd), so one of the most effective techniques that has been used to decrease the dissipated power is to scale the input voltage. This allows for a reduced power loss, but there will still be a minimum power loss for every erased bit, governed by Landauer's Principle. It was shown in [1] that for every erased bit, the lower bound for the energy loss is given by kT ln 2 J, where k is Boltzmann's constant (1.38 × 10^-23 J/K) and T is the temperature in kelvin. Even though this number is not high for a single transistor, modern microprocessors, including the ones used in cell phones and tablets, use hundreds of millions of them. Additionally, current technology consumes almost three orders of magnitude more energy than this limit, so it is important for devices to use the available energy wisely.

One of the ways of completely eliminating power loss is to use reversible logic [2]. Reversible logic is a way of performing calculations so that information that is no longer needed is not dissipated as heat, but saved so that the operations can be undone later. While this is an excellent alternative to using current Complementary Metal-Oxide-Semiconductor (CMOS) logic, circuits that take full advantage of reversible logic are still in research stages and few attempts have been made at making circuits more complicated than full adders. Reversible logic is also more difficult to work with, since all operations and operands have to be saved and a different architecture is required to accommodate all these changes.

An alternative to CMOS logic is a technique called adiabatic switching, which ideally operates as a reversible thermodynamic process, without loss or gain of energy. Adiabatic computation works by making very small changes in energy levels in circuits sufficiently slowly, ideally resulting in no energy dissipation. There are two types of adiabatic computation:

• fully adiabatic — the circuit operates arbitrarily slowly, loses arbitrarily little energy per operation, and almost all of the input energy is recovered
• partially (quasi-) adiabatic — some energy is recovered and some is lost to irreversible, non-adiabatic operations

While fully adiabatic circuits are very attractive from a reversible logic point of view and would be the most suitable for reversible circuit implementation, most proposed adiabatic circuit designs are partially adiabatic because of simplicity and space constraints.

Power loss in conventional CMOS transistors mainly occurs because of device switching and is most easily understood by studying the CMOS inverter shown in Fig. 1.

Fig. 1: CMOS Inverter.

An inverter consists of pull-up and pull-down transistors connected to a capacitance C. The capacitance in this case models the fan-out of the output signal. The two transistors are in parallel with each other and in series with C. A more compact way to model this is with an ideal switch and a channel resistance R when in saturation mode, as shown in Fig. 2.
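As a quick numerical illustration of the Landauer bound quoted above (our worked example, not from the paper):

#include <stdio.h>
#include <math.h>

int main(void)
{
    const double k = 1.38e-23;     /* Boltzmann's constant, J/K */
    const double T = 300.0;        /* room temperature, K */
    printf("Landauer limit: %.3g J/bit\n", k * T * log(2.0));
    /* ~2.87e-21 J; as noted above, current technology dissipates
     * roughly three orders of magnitude more per erased bit. */
    return 0;
}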

When the logic level is set high, there is a sudden flow of current from the voltage source, through the ideal switch and lumped resistor, to the capacitance C, as shown in Fig. 3. The sudden change in voltage level across R accounts for the large amount of energy lost during CMOS charging.

Fig. 2: Equivalent circuit of a CMOS inverter.

The energy dissipated in this circuit can be modeled by

E_applied = C·Vdd²    (1)

for each clock cycle for a rail-to-rail voltage Vdd. When the voltage switches from zero to Vdd, the current flow is given by

i(t) = (Vdd/R)·e^(−t/RC)    (2)

and the power used is

P(t) = (Vdd²/R)·e^(−2t/RC).    (3)

Fig. 3: Voltages at the input, capacitor, and resistor equivalent circuit.

The amount of energy stored in C is found by integrating the power over time and is given by

E_stored = C·Vdd²/2    (4)

which means that half of all supplied energy is stored in the capacitor and the other half is dissipated in R during the charge cycle. In regular CMOS circuits, the energy stored in C is dissipated during the discharge cycle on the falling edge of the clock.

Since the energy stored in the capacitor does not need to be minimized, it is necessary to minimize the energy wasted in the transistor network in order to achieve any energy savings. If the circuit is driven with a frequency f and period T, the total power used in the circuit during a cycle, calculated using the above formula, yields

P_applied = E_applied/T = C·Vdd²/T.    (5)

Adiabatic switching tries to minimize the energy wasted during charging by using a constant current source and charging at a lower frequency f. Both of these optimizations can be calculated by minimizing the function of energy dissipation and current, yielding the dissipated energy formula given by

E_dissipated = P·ΔT = (C·Vdd/ΔT)²·R·ΔT.    (6)

If the charging time ΔT is infinitely long, theoretically there will be no energy dissipated. Infinitely long charging times are impractical, but by spreading out the charge transfer evenly during the charging time, the peak current and large initial power loss are greatly reduced. Adiabatic switching is achieved by replacing the constant DC voltage supply with a time-varying LC driver/oscillator in order to get a constant charging current.

The rest of the paper is organized as follows: Section 2 presents the most commonly used adiabatic oscillators; Section 3 discusses a few of the more important adiabatic logic families and makes a distinction between the ones that use diodes and those that do not; Section 4 describes the proposed adiabatic family and compares its energy usage with CMOS; and Section 5 concludes the paper.

2. Adiabatic Logic Oscillators

A large part of the dissipated energy is lost due to the sudden flow of charge on the rising edge of the square-wave clock, as shown above. Adiabatic computing methods have tried to avoid this loss by making the clock as linear as possible, depending on the design of the logic family. Because the oscillator acts both as a clock and a power supply reference, the convention is to call it a Power Clock (PC).
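To see numerically why a slowly ramping PC dissipates less than a square-wave supply, equations (4) and (6) can be compared directly; the sketch below is our illustration with arbitrary example component values, not values from the paper.

#include <stdio.h>

int main(void)
{
    const double C   = 100e-15;   /* 100 fF load (arbitrary example) */
    const double Vdd = 1.8;       /* rail voltage, V */
    const double R   = 10e3;      /* channel resistance, ohms */
    const double dT  = 10e-9;     /* charging ramp, 10 ns >> RC = 1 ns */

    double e_cmos = 0.5 * C * Vdd * Vdd;                       /* eq. (4): lost in R */
    double e_adia = (C * Vdd / dT) * (C * Vdd / dT) * R * dT;  /* eq. (6) */

    printf("CMOS:      %.3g J per charge cycle\n", e_cmos);
    printf("adiabatic: %.3g J per charge cycle\n", e_adia);
    /* The ratio e_adia/e_cmos equals 2RC/dT: the slower the ramp,
     * the smaller the fraction of the stored energy that is wasted. */
    return 0;
}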

One of the first proposed adiabatic oscillators has a trapezoidal waveform [7]. This oscillator consists of four general stages (Fig. 4): charge, evaluate, discharge, and idle. The output capacitor is charged in the Charge stage, evaluated during the Evaluate stage, and discharged adiabatically back to the PC during the Discharge stage. The PC is then held at ground (GND) during the Idle stage. This type of PC allows the signal to stabilize better during the two plateaus (at Vdd and GND), but uses a 4-, 6-, or even 8-phase clock, which gets very difficult to control in larger circuits. Additionally, new circuitry is needed in order to generate a linear ramp voltage for charging/discharging.

Fig. 4: A single-phase trapezoidal PC with stages marked.

Another oscillator has a sawtooth waveform. This oscillator allows for a more linear charge of the load capacitance and could perform better than the trapezoidal oscillator, since it can have a longer time to charge the outputs at the same frequency. However, it is more difficult to sample the outputs at the peaks, since there is no extra time allowed for them to stabilize. A sawtooth oscillator also requires additional circuitry for generation, and most logic families using it need to use two phases.

The last type of oscillator has a sinusoidal waveform [9]. This oscillator is a compromise between linear voltage increase and generation circuit complexity. It is easier and more energy-efficient to generate a sinusoidal waveform than a linear waveform; however, the sinusoidal waveform does not provide the best approximation of a linear voltage. Most designs with a sinusoidal waveform use a 2-phase clock, although there exist a few designs with a single-phase clock.

3. Adiabatic Logic Families

Adiabatic circuits can be further divided into two categories when it comes to charge recovery: those that use diodes and those that do not. Using diodes makes the circuit structure simpler; however, there is an inherent voltage drop across a diode, and the energy dissipated there cannot be recovered. The circuits that do not use diodes have better charge-recovery statistics, but are usually larger. Some of the well-known adiabatic families are described below. For the purpose of comparing different adiabatic techniques, each one is shown as an inverter, buffer, or both, depending on the logic family.

3.1 Adiabatic families with diodes

1) ADCL
The Adiabatic Dynamic CMOS Logic (ADCL) family has a circuit structure that is very similar to a CMOS circuit. The big difference is that the power clock is a single-phase sinusoidal power clock and the GND connection is connected back to the power clock [4], as shown in Fig. 5. The diodes are represented using their pMOS (P1) and nMOS (N2) equivalents. During the Discharge phase, the input switches from low to high, the PC swings from high to low, and the output follows it, thus recovering the charge stored in the capacitor back into the circuit. During the Charge phase, the PC swings from low to high, and if the input switches from high to low, the output will follow the PC and charge until the PC hits Vdd. One disadvantage of ADCL is that the operating speed of the circuit is inversely proportional to the number of gates, so it is not feasible to implement large circuits with this family. An improvement over this family is the Two-Phase drive Adiabatic Dynamic CMOS Logic (2PADCL) family, which has a very similar setup to ADCL [5]. 2PADCL uses 2 complementary power clocks connected at each end of the circuit (instead of the 1 clock used in ADCL) and achieves better speeds, since the next stage does not have to wait for the output from the current stage to be recovered before computing its output.

Fig. 5: ADCL inverter [4].

3.2 Adiabatic families without diodes

1) 1n1p
One of the simplest diode-free adiabatic circuits proposed is the 1n1p quasi-adiabatic logic family. This family uses the same setup as a CMOS inverter, but uses a single-phase sinusoidal driver that oscillates between GND and Vdd [6]. A diagram of the inverter is given in Fig. 6.

Fig. 6: 1n1p inverter [6].

When the input to the 1n1p inverter is high, the nMOS will be ON and the output will be low. When the input

is low, the pMOS will be conducting and the output will charge to high up to the peak of the PC (VPC). As VPC ramps down, the output will follow and the charge stored in the output capacitor will be recovered. This logic family provides good results; however, it is not suitable for pipelining. In a pipelined circuit, the next stage of the pipeline should not be affected if its inputs change while it is computing its outputs. In the case of the 1n1p logic family, when out is connected to the next stage and the energy stored in out starts being recovered, the next stage will not be able to have a constant/reliable output, since the input changes. Thus, 1n1p requires the input to be stable for the entire time of computing the output in order to have valid data.

2) 2n2p
This logic family uses 2 nMOS and 2 pMOS transistors to achieve adiabatic operation and compute a logic function and its complement, given an input and its complement [7]. The family uses a four-phase trapezoidal clock, and the 2n2p buffer/inverter is shown in Fig. 7.

Fig. 7: 2n2p buffer/inverter [7].

The clock begins in the Reset stage, where it ramps down. At this time the inputs are low and the outputs are complementary. The output that is high will adiabatically ramp down to low, since it is being controlled by the output that is low through a pMOS transistor. During the Wait stage the inputs are evaluated and the corresponding output is assigned a GND value. Since both outputs are at GND during this stage and the upper half of the gate (the 2 pMOS transistors) is held at GND by the power clock, the values of the outputs will not change. In the next stage, Evaluate, the outputs are evaluated based on the resolved inputs. The two outputs will always have complementary values at the end of the Evaluate stage, as one is held constant at GND by the nMOS transistor and the other is charged from the PC via one of the pMOSs. The cross-coupled pMOSs guarantee that the two outputs will always be complementary. In the last phase, Hold, the inputs ramp down with the value of the PC and the outputs stay constant, to be sampled by the next gates.

3) 2n-2n2p
This family is a variation of the 2n2p family and consists of 2 new cross-coupled nMOS transistors added in parallel to the 2 nMOS transistors [7]. The timing and operation of this family is identical to the 2n2p family, and the buffer/inverter is shown in Fig. 8. The new nMOS pair makes the 2n-2n2p act as a full inverter, and the buffer is similar to a Static Random-Access Memory (SRAM) cell. The new nMOSs also have the advantage of eliminating floating nodes in the system, which prevents charge leakage. However, the added transistors prevent it from achieving significant energy savings at speeds above 100 MHz.

Fig. 8: 2n-2n2p buffer/inverter [7].

4) PAL
Pass-Transistor Adiabatic Logic (PAL) [8] is another variation on the 2n2p logic family described above. In this family, instead of having the lowest node connect to GND, it is connected to the PC, as shown in Fig. 9. This allows for a fully adiabatic operation, but at the cost of operating speed. Another difference between the two families is that PAL uses a two-phase sinusoidal power clock, which allows for simpler implementation and potentially higher power savings.

Fig. 9: PAL inverter [8].

5) TSEL
True Single-Phase Energy-Recovery Logic (TSEL) [9] is a partially adiabatic logic family similar to the 2n-2n2p family. This family uses a single-phase sinusoidal power clock with cascades made up of alternating pMOS and nMOS gates. The structure of a buffer/inverter is given in Fig. 10. Each TSEL gate contains a reference voltage (either VRP or VRN, depending on the type of transistor) as a bias voltage; these reference voltages are a distinct characteristic of this family. The operation of a pMOS TSEL gate consists of two stages: discharge and evaluate. During the discharge stage the energy stored in the out or out' node is recovered. At the beginning of this stage VPC is high, and as it starts ramping down, both outputs are pulled towards the pMOS threshold voltage VTP. This transition is adiabatic until VRP − VTP is higher than VPC. During the Evaluation phase, assume that in is high and in' is low. At the beginning of this phase, VPC is low, but as it starts to rise, it turns on P3 and P4, turning on the cross-coupled pMOS transistors P1 and P2. As long as VPC ≤ VRP − VTP, P3 and P4 are conducting. Since VRP > VPC, out starts rising towards VRP through P4. The two cross-coupled pMOSs (P1 and P2) help amplify the voltage difference between out and out'. Once the difference between the output nodes is larger than VTP, P1 turns off and out charges adiabatically. When VPC ≥ VRP − VTP, P3 and P4 stop conducting and the outputs are disconnected from the lower half of the gate. This allows this family to be immune to any changes that occur in the inputs after P3 and P4 have been turned off. Sampling of the outputs occurs at the end of the Evaluate stage, when the power clock is at its highest.

Fig. 10: pMOS TSEL inverter [9].

An nMOS TSEL buffer/inverter operates in a similar way, the only difference being that the stages are reversed from the pMOS TSEL buffer/inverter. The nMOS TSEL gate (Fig. 11) goes through a Charge phase and then an Evaluate phase, and it also uses a different reference voltage, VRN. By using a combination of nMOS and pMOS TSEL gates, this logic family is able to achieve very good power savings over other families.

Fig. 11: nMOS TSEL inverter [9].

4. Proposed Design

A new adiabatic logic family is proposed that achieves very good charge recovery and uses a single-phase sinusoidal PC. The proposed buffer/inverter is shown in Fig. 12. When the PC swings from GND to Vdd, the value on the in' input gets assigned to the out output and in gets assigned to the out' output, achieving the inverter function. Assuming that in is at Vdd and in' is at GND, in will cause N3 to turn ON and out to be at GND level. When out is at GND, P2 is enabled, allowing out' to ride the PC to Vdd level. On the down-cycle, since out is at GND level, P1 is enabled, allowing the energy stored in out' to be recovered. When the values of the inputs are swapped, the output values are swapped as well.

Fig. 12: Proposed Inverter.

The power consumption of the proposed circuit and that of a CMOS circuit were calculated with a SPICE simulation. A 0.25µm process was used, with a W/L of 0.36µm/0.24µm, and with the PC oscillating between 1.8V and GND. The results of the simulation for various frequencies are shown in Fig. 13. As can be seen from the simulation results, the proposed adiabatic inverter has an energy dissipation around 68% lower than a CMOS inverter at 100MHz.

Fig. 13: Energy dissipation comparison between proposed buffer/inverter and CMOS inverter at different frequencies.

The proposed design has the advantage of using only a single-phase PC, which reduces the oscillator complexity, saves on-chip space, and allows for simpler control of connected gates for clock synchronization. The proposed buffer also offers a new way to control the energy flow in the circuit while producing results similar to the ones found in the literature [7], [9] by using a single-phase PC.

5. Conclusion and Future Work

This paper explained the basics of adiabatic computation and described the most well-known adiabatic logic families. It was shown that fully adiabatic computation is theoretically possible without any energy loss as the computation time goes to infinity, but for most applications the use of a partially adiabatic logic family is far more suitable. Future work includes the design of larger adiabatic gates and circuits from the proposed buffer/inverter, dissipated-energy analysis at higher frequencies, and comparison with other adiabatic families.

References

[1] R. Landauer, "Irreversibility and heat generation in the computing process," IBM Journal of Research and Development, vol. 5, pp. 183-191, July 1961.
[2] C. Bennett, "Logical reversibility of computation," IBM Journal of Research and Development, vol. 17, no. 6, pp. 525-532, Nov. 1973.
[3] S. Younis, "Asymptotically Zero Energy Computing Using Split-Level Charge Recovery Logic," Ph.D. dissertation, EECS, MIT, Cambridge, MA, 1994.
[4] K. Takahashi and M. Mizunuma, "Adiabatic dynamic CMOS logic circuit," Electronics and Communications in Japan Part II, vol. 83, no. 5, pp. 50-58, 2000.
[5] Y. Takahashi et al., "2PADCL: Two phase drive adiabatic dynamic CMOS logic," in Asia Pacific Conference on Circuits and Systems, pp. 1484-1487, 1996.
[6] V.I. Starosel'skii, "Reversible logic," Mikroelektronika, vol. 28, no. 3, pp. 213-222, 1999.
[7] A. Kramer et al., "2nd order adiabatic computation with 2N-2P and 2N-2N2P logic circuits," in Proceedings of the 1995 International Symposium on Low Power Design, pp. 191-196, 1995.
[8] V.G. Oklobdzija and D. Maksimovic, "Pass-transistor adiabatic logic using single power-clock supply," IEEE Transactions on Circuits and Systems II: Analog and Digital Signal Processing, vol. 44, pp. 842-846, 1997.
[9] S. Kim and M. Papaefthymiou, "True single-phase adiabatic circuitry," IEEE Transactions on VLSI Systems, vol. 9, no. 1, 2001.

Optimising Energy Management of Mobile Computing Devices

M.J. Johnson and K.A. Hawick
Computer Science, Institute for Information and Mathematical Sciences, Massey University, North Shore 102-904, Auckland, New Zealand
email: { m.j.johnson, k.a.hawick }@massey.ac.nz
Tel: +64 9 414 0800 Fax: +64 9 441 8181
February 2012

ABSTRACT
Mobile computing devices are becoming ubiquitous and the applications they run are demanding greater processing and storage capabilities. Managing the power consumption and battery life of these devices is increasingly difficult but some careful choices made in the software architecture stack can optimise power utilisation while still maintaining needed services on the architecture. We describe the hardware blocks in a modern mobile device and measure their power requirements. We discuss some power management strategies and present results showing how some quite dramatic energy savings are possible on a typical modern mobile device running Android and Linux. We discuss the implications for future mobile computing device architectures.

KEY WORDS
device architecture; power consumption; battery life; mobile devices

1 Introduction

Currently, mobile devices make up about fifty percent of all personal computing platforms. This share has increased from almost zero in the last 5 years, and mobile devices are becoming ever more pervasive. While it is not vital that a mobile device lasts for weeks on a single battery charge, it is important that it can last a complete day even with heavy use, because many users charge their phones overnight every day [1].

Power is measured in watts; however, at a fixed voltage this is proportional to the current in amps. Mobile device battery capacity is always specified in mAH (milliamp-hours), so throughout this paper we will use mA as a measure of power consumption.

While it is possible to save power by careful design of applications and protocols [8], [9], this paper will concentrate on how the Operating System can be used to manage hardware for best power management.

Architects of desktop CPUs have long been aware of power consumption, and efforts to produce more power-efficient machines have led to the use of metrics such as FLOPS/watt and MIPS/watt. However, mobile devices have very different requirements and these metrics are not as useful. In Section 2 we look at the hardware present in a mobile device and discuss its power management.

Modern mobile hardware is extremely complex, and a mobile device will typically have more peripherals than a standard desktop PC. The hardware supported by a mobile device usually includes the following: Applications CPU, Baseband CPU, LCD panel and controller, RAM, NAND flash memory, MMC flash memory, USB controller, audio subsystem, GPS, video encoder/decoder, serial I/O, multiple DSPs, GPU, 2D graphics controller, touchscreen, cameras, flashlight, LEDs, battery monitor, Wifi. All this hardware is typically powered by a 3.7V battery with a capacity of around 1500mAH. It is vital that the power used by the device is managed efficiently, to increase the time that the device is usable [3].

Figure 1: QSD8250 CPU
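Since capacity is specified in mAH and draw in mA, battery-life estimates reduce to a simple division, as in this small sketch (our illustration, not the authors' code):

#include <stdio.h>

/* Estimated runtime in hours from capacity (mAH) and average draw (mA). */
double runtime_hours(double capacity_mah, double avg_draw_ma)
{
    return capacity_mah / avg_draw_ma;
}

int main(void)
{
    /* e.g. a 1500 mAH battery at an average 100 mA draw */
    printf("%.1f hours\n", runtime_hours(1500.0, 100.0));  /* 15.0 */
    return 0;
}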

Future mobile devices will need to use even more power than current devices if they are to support higher speed networking, advanced graphics and distributed resource sharing [2]. Figure 1 shows the layout of the main logic board on a typical modern mobile device. Most of the hardware discussed in this paper is integrated into the large ASIC on the left.

2 Hardware Architecture

The hardware in a mobile device consists of a System on Chip (SoC), memory and a number of peripheral devices. The SoC is an ASIC composed of a number of hardware blocks. The hardware in a modern mobile device will usually include the following blocks.

Application processor: This CPU runs all applications and has access to most hardware; it may be multi-core. Currently most application processor CPUs are ARM Cortex A8 or A9 designs. The ARM CPU was designed as a low power device and it supports various low power modes, which will be discussed in Section 3.3.

Baseband Processor: This processor runs the radio software and controls the radio frequency interface. The baseband processor is either an ARM CPU and DSP or a standalone DSP. Some devices have the baseband integrated in the SoC, others use a separate ASIC. The baseband processor usually runs a proprietary real-time operating system and so its power management techniques are not available for analysis; however we expect them to be very similar to those used by the application processor.

I2C: A simple two-wire bus used for controlling slow peripherals such as the touch screen, battery driver and camera CCD. The bus itself has no power management capabilities; power down or sleep commands must be sent to the individual peripherals on the bus.

GPIO: General purpose input/output pins, used for controlling hardware. Each GPIO may have many functions for controlling internal operation of the SoC and may be multiplexed to SoC I/O pins. There may be many hundreds of GPIOs and each must be configured to draw as little power as possible. A common cause of excess power drain is a misconfigured GPIO.

Display interface: A serial bus used to connect the display panel; common interfaces are MDDI (Mobile Display Digital Interface) and DSI (Display Serial Interface). These buses do include power management capabilities and send messages to the LCD controller about idle states.

Audio: Control of the speaker, handset audio (ear-coupled speaker and mic) and headset audio. The audio subsystem can use significant power when playing audio through an amplified speaker.

GPS: Global positioning system, uses a DSP to decode GPS satellite signals for location determination. For fast fix times the mobile network needs to be used to download almanac data. Although the GPS can use significant power, it is not always enabled and there are techniques that can be used to minimise its use [7] [12].

USB: A USB controller in device or host/device mode. The 5V USB power is often used to charge the device, so power use is not an issue when the USB port is connected.

SD: Secure digital controller for memory cards, also used for the wifi interface on some devices (SDIO). SD devices use minimal power and may be disconnected and powered down when not in use. Recent Linux kernels have an SD Abstraction Layer (SDAL) which can be used to power down the card when it is idle.

NAND: Flash memory used for storage of the Operating System and data. Recent devices may also have eMMC flash storage. These devices consume very little power.

VFE: Video Front End, used to transfer data from a high resolution camera.

UART: Serial interface connected to the SIM card and Bluetooth controller.

2D graphics engine: Used for block transfers of image data in various formats. May perform scaling and rotation of bitmaps. The 2D engine will often use much less power than the GPU, and so unless the 3D engine is needed, graphics composition should use the 2D engine.

DMA: Direct Memory Access; most blocks can transfer data to/from memory without any CPU intervention. This saves power because it allows the CPU to enter a low power state more often.

ADSP: Application Digital Signal Processor, used for processing audio and video data; offload of compute intensive functions to the DSP can save power [4].

GPU: Graphics Processing Unit, a 3D graphics engine supporting OpenGL-ES 2.0. The GPU can use significant power, thus the use of user interface features such as complex animated wallpapers should be minimised for best power savings.

MDSP: Modem Digital Signal Processor, used for processing radio frequency signals; this is controlled by the baseband processor.

RAM: Memory external to the SoC, usually Low Power DDR2.

Display: An LCD or LED panel with a resolution of up to 1080x720 pixels. LCD panels use slightly more power than LED panels; modern designs are becoming very power efficient. Until recently, the display was the most power hungry component; this is no longer the case, as we show in Section 3.1.

Light sensor: A sensor capable of detecting the ambient light level. This sensor is polled once a second when the display is on and is used to control the backlight for the LCD panel. It uses minimal power and allows the LCD panel to use less power when ambient light is low.

Proximity Sensor: A sensor capable of determining if the device is close to another object; this sensor usually uses an infra-red emitter and light sensor to detect any reflected light. It uses about 5mA during very short flashes and is only enabled when necessary. It is used to turn off the display when it is placed close to the head during a call. This saves power and avoids the touchscreen being pressed by mistake.

Cameras: A rear facing camera with up to 8Mpixels, capable of streaming HD video. The camera is initialised and focused by sending commands over an I2C bus. There may also be a lower resolution front-facing camera for video calls. Cameras use significant power, as discussed in Section 3.1.

Accelerometers: Sensors used to detect gravitational or magnetic fields and torque. These sensors do not need to be polled frequently and techniques are available to minimise their effect on power consumption [11].

Real Time Clock: A battery backed clock capable of maintaining the system date and time; it is capable of waking up the device at a set time and must run for long periods of time on minimal power.

Timers: Low and high resolution counters used for system timing. The counters themselves do not use much power, but their use should be minimised because they may prevent the device from entering low power modes, as discussed in Section 3.3.

Encryption processor: Used for efficient processing of encrypted communication channels. This is controlled by the baseband processor.

Video Processor: Capable of encoding and decoding various video formats including H264 and MPEG4. Software codecs should be avoided because the hardware encoders will use much less power and allow the CPU to idle.

Audio Processor: Capable of encoding and decoding audio formats such as MP3 and AMR. Similarly, these should be used instead of software.

Battery Controller: A device capable of determining the charge remaining in the battery and charging the battery when the device is connected to an external power source. The battery controller usually uses coulomb counting to keep a record of how much power has been used.

Bluetooth: Used for short range RF communications. Bluetooth is designed for low power devices and consequently does not significantly affect power consumption, although some applications may make heavy use of it [10].

Mobile Voice Connectivity: A mobile phone must be connected to the voice network when possible. In areas with good signal strength this can be power efficient; however, where the signal strength is low or intermittent, techniques must be used to avoid significant power drain [6].

Mobile Data Connectivity: The mobile data network for a modern device is always connected, and so application software must be carefully written to avoid frequent use of mobile data. If possible, synchronisation and other background events should be batched and performed together.

Wifi: Wifi was not designed for mobile devices and so its power management is more problematic; however, modern Wifi adapters use techniques such as Beacon and Idle Mode power saving and Traffic Coalescing [13] to limit the time spent transmitting data. Wifi will generally use less power than the mobile data network and so should be used in preference.

3 Software Architecture

We discuss the software architecture with reference to: an example device; clock control; the Linux CPU clock control mechanism; and deep sleep issues.

3.1 Example Device

The device we worked with contained a Qualcomm QSD8250 SoC running the Android mobile operating system. Android is an Open Source OS running on top of a

Linux Kernel. For this reason it is easy to modify the soft- Hardware Power Usage ware to measure power consumption and see how changes Display 50-80mA to the software stack affect it. We used a device with a Camera 250mA ds2482 battery controller, this is connected via the I2C in- GPS 80mA terface to the SoC and via a proprietary serial interface to GPU 90mA the battery. CPU 70-210mA In order to determine power usage of the device we could Bluetooth 15mA* replace the battery with a power supply and measure the Wifi 90mA* current drawn but in practice it is easier to use the current 3G Data 160mA* measurements provided by the battery controller. The stan- Speaker 80mA dard software only polls the battery controller once every Voice Call 150mA minute so we had to modify the kernel driver to provide * only while transferring data a new interface to the hardware that allowed us to record the current draw at any time. We also modified the driver Table 1: Current used by various hardware blocks that controlled charging so that it could be disabled. This allowed us to connect to the device through the USB inter- V=4084696 I=−123950 I a v =−131253 C=1262400 face to measure power consumption. Linux device drivers typically use read or write operations Using the modified driver we obtained the power usage on special files in the /dev directory to communicate with values shown in Table 3.1. Note that the camera uses the userspace utilities. While we could have used this method, most power, however this is most likely because it is also we decided to use a simpler interface, sysfs. Sysfs is a heavily using the CPU, GPU and DSP. The average power virtual filesystem mounted on /sys, it contains properties drawn depends on how the device is being used. When of device drivers and can be used to set parameters for a reading text, the CPU will be idle and the device will prob- driver. ably use about 100mA, this gives 15 hours of use with a In Linux, device driver parameters were originally used to 1500mAH battery. allow the kernel boot command line to set up a driver, for The CPU is a significant contribution to the power usage, example to pass the IRQ number that it should use. Recent especially when running at full speed. Linux kernels allow access to the driver parameters though a set of sysfs files: Note that when the device is suspended only a very mini- mal set of systems are left powered on (the always-on sub- /sys/module/ { d r i v e r } /parameters/ { param} system) and we measured a power usage of 2-10mA de- pending on signal strength and mobile network type. This interface is very easy to implement and can be also 3.2 Clock Control used to perform operations. A macro is used to add param- eters to a driver: The device has a single clock derived from a tempera- ture compensated crystal oscillator (TCXO). This runs at s t a t i c i n t e n a b l e c h a r g e =1; 19.2 MHz or 32KHz when the device is suspended. Other module param(enable c h a r g e , int , 0 6 4 4 ) ; clocks are made using Phase Locked Loop based multi- pliers (PLLs) and fractional dividers (M/N:D counters) for This creates a parameter called enable charge initialised to example on our test device a CPU clock of 998MHz is gen- 1. erated by multiplying TCXO by 52. Four PLLS are avail- m o d u l e p a r a m call(battery ,set c h , able and dividers are used to generate all the other clocks g e t batt ,NULL,0644); in the system forming a clock tree. There are roughly 150 clocks. 
This creates a parameter called battery; writing to it calls set_ch, reading calls the get_batt function. Modifications to the ds2784 battery driver are shown in Figure 2. Example usage of the modified battery driver is:

# cd /sys/module/ds2784_battery/parameters
# ls
battery  enable_charge
# echo 0 > enable_charge
# cat battery
V=4084696 I=-123950 I_av=-131253 C=1262400

static int enable_charge = 1;
module_param(enable_charge, int, 0644);

static int battery_adjust_charge_state(struct ds2784_device_info *di)
{
    ....
    if (!enable_charge)
        charge_mode = CHARGE_OFF;
    ....
}

static int set_charging(const char *val, struct kernel_param *kp)
{
    battery_adjust_charge_state(the_di);
    return 0;
}

int get_battery(char *buffer, struct kernel_param *kp)
{
    struct battery_status *s = &(the_di->status);

    ds2784_battery_read_status(the_di);
    return sprintf(buffer, "V=%d I=%d I_av=%d C=%d",
                   s->voltage_uV, s->current_uA,
                   s->current_avg_uA, s->charge_uAh);
}

module_param_call(battery, set_charging, get_battery, NULL, 0644);

Figure 2: Modifications to the ds2784 Battery Driver
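The new parameter file can also be read programmatically rather than with cat. The following userspace sketch is our own illustration, not code from the paper; it assumes the sysfs path created above and the output string produced by get_battery().

#include <stdio.h>
#include <unistd.h>

/* Poll the modified driver's sysfs parameter file and log the
 * instantaneous current draw once per second. */
int main(void)
{
    const char *path = "/sys/module/ds2784_battery/parameters/battery";
    int v_uV, i_uA, iavg_uA, c_uAh;

    for (;;) {
        FILE *f = fopen(path, "r");
        if (!f)
            return 1;
        /* Format produced by get_battery(): "V=%d I=%d I_av=%d C=%d" */
        if (fscanf(f, "V=%d I=%d I_av=%d C=%d",
                   &v_uV, &i_uA, &iavg_uA, &c_uAh) == 4)
            printf("%d uV  %d uA\n", v_uV, i_uA);
        fclose(f);
        sleep(1);
    }
}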

3.2 Clock Control

The device has a single clock derived from a temperature compensated crystal oscillator (TCXO). This runs at 19.2 MHz, or 32KHz when the device is suspended. Other clocks are made using Phase Locked Loop based multipliers (PLLs) and fractional dividers (M/N:D counters); for example, on our test device a CPU clock of 998MHz is generated by multiplying TCXO by 52. Four PLLs are available, and dividers are used to generate all the other clocks in the system, forming a clock tree. There are roughly 150 clocks.

Each hardware block uses at least one clock, and to save power these clocks must be disabled when not in use. The CPU clock can be modified to slow it down or speed it up; this gives 20 available frequencies between 245MHz and 1.1GHz.

Figure 3 shows how clock speed affects CPU power usage; as expected this is linear. CMOS transistors use power to switch on and off, so the faster the clock the more power is used. By extrapolating from the data points we can see that even if the clock is switched off, the CPU still draws about 50mA.

Figure 3: CPU Power usage

3.3 Linux CPU Clock Control

The Linux mechanism for performing CPU clock control is called Cpufreq. This changes the CPU frequency based on load. The CPU clock driver tells the kernel which frequencies are available and provides functions to change frequency. A Cpufreq governor algorithm changes frequency by using various OS load metrics and heuristics. A number of governors are available, the most widely used being the 'ondemand' governor. Cpufreq uses a sysfs interface via files in /sys/devices/system/cpu/cpu0/cpufreq/.

Although the CPU clock cannot be set to 0 by manipulating the clock hardware, the ARM CPU has a special instruction that can be used for this purpose. The SWFI instruction (Suspend and Wait For Interrupt) disables the CPU clock and enables it again when an interrupt occurs. The Linux idle loop, which runs when no process needs the CPU, executes this instruction. Thus, when the CPU is idle it draws the minimum power in Figure 3 (about 50mA).

Recent Linux kernels can be 'tickless'; this means the kernel doesn't have a periodic timer interrupt, so it can sleep for long periods of time. Timers can also be 'deferrable', so a non-urgent timer interrupt will wait until the CPU is not idle. This means that in practice Cpufreq has little effect on power consumption; in fact it can hurt power consumption by preventing the CPU from spending much time idle. The strategy of running as fast as possible and then idling for as long as possible is called 'race to idle'. Whether clock scaling or race to idle is more power efficient depends on a number of factors, such as how long it takes to put the CPU into an idle state and how memory bound a task is (memory bound tasks are not affected by CPU frequency).
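To make this tradeoff concrete, the toy model below compares the average current of finishing a fixed workload quickly and then idling against running slower for the whole interval. It is our own illustration; the currents are representative of the ranges reported above (the 576MHz figure is an assumption), not measurements from the paper.

#include <stdio.h>

/* Toy 'race to idle' model: a workload of W cycles must finish
 * within a 1 s interval. Average current ~ I(f)*t_active + I_idle*t_idle. */
int main(void)
{
    const double W = 500e6;         /* cycles of work                 */
    const double I_idle = 50.0;     /* mA when idle (SWFI)            */
    const double I_fast = 210.0;    /* mA at 998 MHz                  */
    const double I_slow = 110.0;    /* mA at 576 MHz (assumed value)  */

    double t_fast = W / 998e6;      /* finish early, then idle        */
    double race   = I_fast * t_fast + I_idle * (1.0 - t_fast);
    double t_slow = W / 576e6;      /* stay busy most of the interval */
    double scale  = I_slow * t_slow + I_idle * (1.0 - t_slow);

    /* Which strategy wins depends entirely on the I(f) curve and on
     * how memory bound the workload is, as discussed above. */
    printf("race-to-idle: %.1f mA avg, clock-scaled: %.1f mA avg\n",
           race, scale);
    return 0;
}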

3.4 Deep Sleep

When SWFI is executed, all clocks (e.g. timers and bus clocks) are still running; it is possible to stop these clocks and save even more power. If this happens then an interrupt can no longer wake the CPU; the only way the CPU can be woken is if the baseband processor is used to wake it up. This mechanism requires careful cooperation between the application CPU and the baseband CPU. When we enabled this idle sleep mode the power consumption dropped from 50mA to 30mA.

Even with all the clocks disabled there is still some CPU power usage caused by leakage current. CMOS transistors use power even if they are not clocked, and this leakage becomes much worse as the FET channel length is reduced, so it is increasingly affecting modern devices.

The solution to minimising this leakage current is to use a global distributed footswitch (or headswitch). This technique involves adding FETs to the power rails of the device (on the Vdd rail it is a headswitch, on Vss it is a footswitch) so power can be removed completely. This idle mode is called Deep Sleep or Power Collapse. Once again the baseband processor must wake the application processor, but now the application processor must be completely reinitialised. When we enabled this deep sleep mode the power consumption dropped to 14mA.

The disadvantage of the deeper sleep modes is that they can take a significant time to go to sleep and wake up. Thus, they should only be used when an interrupt is not expected shortly. For asynchronous interrupts, it is not possible to know if one is about to happen; however, the most common interrupt is a timer interrupt, and the OS can determine when this will next happen. Thus, the idle code can look at when the next timer interrupt is due and choose an appropriate sleep mechanism. There are also other constraints on sleep modes; for example, on the device we were using, GPU interrupts could not be used to wake from either of the deep sleep modes. This may explain why the measured power consumption of the GPU is relatively high.

4 Voltage Control

The power used by a CMOS device is proportional to the clock frequency, but it is also proportional to the square of the voltage. This means that lowering the CPU voltage Vdd should save more power than lowering the frequency. For a given frequency there is a minimum Vdd at which the CPU will function; this voltage also depends on temperature, because the gate delay of a CMOS transistor increases with temperature.
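The textbook relation behind this statement (not written out explicitly in the paper) is

$$P_{\mathrm{dyn}} \approx \alpha \, C \, V_{dd}^{2} \, f$$

where α is the switching activity, C the switched capacitance, V_dd the supply voltage and f the clock frequency. Halving f alone roughly halves P_dyn, while additionally lowering V_dd by 20% cuts the remainder by a further 36%, since 1 − 0.8² = 0.36.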
4.1 Static Voltage Scaling

Static Voltage Scaling (SVS) is the technique of using a set CPU voltage for each possible CPU frequency. This voltage must be chosen such that the CPU will still function at all temperatures.

In order to implement SVS, an extra field is added to the CPU frequency table to specify a voltage for that frequency. It is important that the voltage is changed at the correct time: the voltage must be increased before increasing the frequency, but decreased after decreasing the frequency. This ensures that the voltage is always at or above the required value.

4.2 Dynamic Voltage Scaling

The voltage used for SVS must be chosen so that it is safe for all devices, and thus it has to be chosen very conservatively. It would be better if the device could provide feedback about the voltage. This is possible if some extra hardware is included; this is called Dynamic Voltage Scaling (DVS) [5].

The SoC we were using has hardware to support DVS; it uses ring oscillators and a number of delay circuits to model the longest possible delay through three different parts of the CPU: the Datapath, the Floating Point Unit and the Level 2 Cache. We can use these delay circuits to tell if Vdd is too high or too low. If any oscillator thinks Vdd is too low, it must be increased; if all think it is too high, it can be decreased. Since the delay depends on temperature, the die temperature must also be measured and taken into account.

Although the device we were using supported DVS, it was not implemented in the Linux kernel, so we implemented a device driver for DVS. Our driver uses a 2D table of voltages for frequencies and temperatures; it polls the hardware every 200ms until the voltage for the current temperature has stabilised. When the voltage for a particular temperature has stabilised, the table value is fixed so no more polling is necessary. Module parameters are used for status and to enable/disable DVS. When a voltage is set, the table is updated to clamp to this voltage for all lower frequencies and temperatures. Output from the status interface of the driver is shown below.

# cat status
Current TEMPR=9
Current Index=1 Current Vdd=925
Vdd Table:
0:   19200  925 0
1:  245760  925 0
2:  384000  925 0
3:  576000  975 0
4:  768000 1150 1
5:  998400 1225 1
6: 1113600 1300 0

This shows that frequencies of 768MHz and 998.4MHz have stabilised, with voltages of 1.15V and 1.225V respectively.
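A minimal sketch of the driver's core loop is given below. It is our reconstruction from the description above (the names, table dimensions, step size and hardware accessors are all assumptions), not the actual driver source.

#include <stdbool.h>

#define NFREQ 7
#define NTEMP 16

/* vdd_mv[f][t]: best-known safe voltage (mV) per frequency/temperature,
 * seeded from the SVS defaults at init (not shown).
 * stable[f][t]: once set, the cell is fixed and polling stops. */
static int  vdd_mv[NFREQ][NTEMP];
static bool stable[NFREQ][NTEMP];

/* Hypothetical hardware accessors standing in for the real SoC ops. */
extern int  read_die_temp_index(void);
extern bool any_oscillator_says_vdd_low(void);
extern bool all_oscillators_say_vdd_high(void);
extern void apply_vdd(int mv);

/* Called every 200ms for the current frequency index f while the
 * (f, temperature) cell has not yet stabilised. */
void dvs_poll(int f)
{
    int t = read_die_temp_index();

    if (stable[f][t])
        return;                        /* value already settled      */

    if (any_oscillator_says_vdd_low())
        vdd_mv[f][t] += 25;            /* step up: correctness first */
    else if (all_oscillators_say_vdd_high())
        vdd_mv[f][t] -= 25;            /* step down to save power    */
    else
        stable[f][t] = true;           /* in band: stop polling      */

    apply_vdd(vdd_mv[f][t]);

    /* Clamp: a voltage safe at (f, t) is safe for all lower
     * frequencies and temperatures, so propagate it downwards. */
    for (int ff = 0; ff <= f; ff++)
        for (int tt = 0; tt <= t; tt++)
            if (vdd_mv[ff][tt] > vdd_mv[f][t])
                vdd_mv[ff][tt] = vdd_mv[f][t];
}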

Figure 4 shows the effect of SVS and DVS on power consumption. With a fixed Vdd, the voltage must be chosen for the highest possible frequency; thus the device uses significantly more power at lower frequencies. The values for the SVS driver are provided by the Google kernel for this device; it looks like the values have been determined by finding the best voltages for the highest and lowest frequencies and drawing a straight line between them. This means that DVS only provides significant power savings for frequencies in the middle of the table.

Figure 4: Voltage Scaling Results

5 Conclusions

We have investigated various mechanisms for saving power on a mobile device, including clock control, deep sleep and voltage control. In summary, we found that deep sleep (power collapse) provides the greatest power savings, but it is not always available because there is significant overhead involved in putting the device into this mode. Changing clock frequencies can achieve significant power savings, but only for tasks that are not CPU intensive. Voltage scaling helps save power and has the potential to save more power than frequency scaling because of the quadratic relationship between voltage and power. Dynamic voltage scaling is a useful technique, but if the voltages for static voltage scaling are chosen correctly it does not provide significant power savings.

In the future we would like to extend this work by investigating other devices and mechanisms for reducing power usage. These ideas are likely to be of importance for tablet computers as well as for mobile phones.

References

[1] Ferreira, D., Dey, A.K., Kostakos, V.: Understanding human-smartphone concerns: a study of battery life. In: Proc. 9th Int. Conf. on Pervasive Computing (Pervasive'11). pp. 19–33. No. 6696 in LNCS (2011)
[2] Furthmuller, J., Waldhorst, O.P.: Energy-aware sharing with mobile devices. In: Proc. Eighth Int. Conf. on Wireless On-Demand Network Systems and Services. pp. 52–59 (2011)
[3] Gordon, D., Sigg, S., Ding, Y., Beigl, M.: Using prediction to conserve energy in recognition on mobile devices. In: Proc. IEEE Int. Conf. on Pervasive Computing and Communications Workshops (PERCOM Workshops). pp. 364–367. Seattle, WA, USA (21-25 March 2011)
[4] Hameed, R., Qadeer, W., Wachs, M., Azizi, O., Solomatnikov, A., Lee, B.C., Richardson, S., Kozyrakis, C., Horowitz, M.: Understanding sources of inefficiency in general-purpose chips. In: Proc. 37th Int. Symp. on Computer Architecture (2010)
[5] Hormann, L.B., Glatz, P.M., Steger, C., Weiss, R.: Evaluation of component-aware dynamic voltage scaling for mobile devices and wireless sensor networks. In: Proc. IEEE Int. Symp. on World of Wireless, Mobile and Multimedia Networks (WoWMoM). pp. 1–9. Lucca, Italy (20-24 June 2011)
[6] IEEE: IEEE Draft Standard Methods for Measuring Transmission Performance of Analog and Digital Telephone Sets, Handsets, and Headsets - Amendment 1. IEEE P269a/D5.2, pp. 1–33 (January 2012)
[7] Kjaergaard, M.B., Bhattacharya, S., Blunck, H., Nurmi, P.: Energy-efficient trajectory tracking for mobile devices. In: Proc. 9th Int. Conf. on Mobile Systems, Applications and Services (MobiSys'11). Bethesda, MD, USA (28 June - 1 July 2011)
[8] Liu, Y., Guo, L., Li, F., Chen, S.: An empirical evaluation of battery power consumption for streaming data transmission to mobile devices. In: Proc. ACM Multimedia Conference (MM'11). Scottsdale, AZ, USA (28 November - 1 December 2011)
[9] Meng, L., Shiu, D., Yeh, P., Chen, K., Lo, H.: Low power consumption solutions for mobile instant messaging. IEEE Trans. Mobile Computing PP(99), Online, 1–18 (June 2011)
[10] Park, U., Heidemann, J.: Data muling with mobile phones for sensornets. In: Proc. 9th ACM Conf. on Embedded Networked Sensor Systems (2011)
[11] Priyantha, B., Lymberopoulos, D., Liu, J.: Littlerock: Enabling energy-efficient continuous sensing on mobile phones. Pervasive Computing April-June, 12–15 (2011)
[12] Thiagarajan, A., Ravindranath, L., Balakrishnan, H., Madden, S., Girod, L.: Accurate, low-energy trajectory mapping for mobile devices. In: Proc. 8th USENIX Conference on Networked Systems Design and Implementation (NSDI'11). USENIX Association. Boston, MA, USA (30 March - 1 April 2011)
[13] Wang, R., Tsai, J., Maciocco, C., Tai, T.Y.C., Wu, J.: Reducing power consumption for mobile platforms via adaptive traffic coalescing. IEEE Journal on Selected Areas in Communications 29(8), 1618–1629 (September 2011)

Design and Low Power Implementation of a Reorder Buffer

J.D. Fisher, C. Romo, E. John, W. Lin Department of Electrical and Computer Engineering, University of Texas at San Antonio One UTSA Circle, San Antonio, Texas, U. S. A.

Abstract - Superscalar microprocessor designs demand high performance with limited increased power overhead. To obtain high throughput rates, out-of-order execution is implemented, along with the use of dispatch buffers and reorder buffers (ROBs). ROBs can become high power consuming components that require power aware designs. This research investigates different design techniques for the ROB with emphasis on power efficiency. Mainly, the centralized and distributed ROB designs were implemented and a power analysis was completed. The distributed ROB was determined to be more power efficient by 12.84%. This distributed design embraces the concept that efficiency gains, ultimately including lower power consumption, can be obtained by optimizing a process for parallelism and resource sharing.

Keywords: Reorder Buffer, Microarchitecture, Power Efficient

1. Introduction

A standard in the development of today's technology is to achieve the execution of multiple instructions through instruction level parallelism (ILP). ILP is most efficiently exploited through the use of superscalar processors. Out-of-order execution achieves increased performance in superscalar microprocessors through the utilization of a reorder buffer (ROB) to complete instructions in program order to the architected register file. Although highly ambitious, the overall objective of the ROB is to obtain a performance increase equal to that of the parallelized instructions.

The implementation of an ROB can result in a fairly high increase in power consumption. This performance increasing component should therefore be designed to minimize all design tradeoffs, including power overhead. As microprocessors migrate to smaller technologies, power becomes more and more of an issue that needs to be addressed. Such is the purpose of analyzing components in the overall design, such as the ROB. It is important that no component be a power consumer unless it is doing so efficiently. Ultimately, high performance processors need to be power aware in their design in order to provide reliable, long lasting products.

This research focuses on the following theories as a foundation for designing a power efficient ROB:
• the centralized ROB architecture [1] and [2],
• the distributed ROB architecture [3],
• the ability to utilize the concept of dynamic voltage scaling in order to break the ROB into pieces and provide power only to sections of the ROB that need the power [4],
• utilizing fast comparators that focus the power mainly when a match is found [5],
• dynamic ROB resizing [6].

Mainly, the centralized and the distributed ROB are investigated and implemented in this research to demonstrate the ability to minimize power in a microprocessor. The rest of this paper is organized as follows: Section 2 provides an overview of power consumption, superscalar pipelining and the ROB; Section 3 describes related work; Section 4 provides the details of our design and implementation; Section 5 describes the methodology used for accomplishing the research; Section 6 presents the results and analysis; and Section 7 summarizes our research in a conclusion.

2. Background

2.1 CMOS Power Dissipation

Power dissipation analysis is a key component in the progression of new technology. Evaluation of power involves two types of power dissipation, static and dynamic [7]. The most significant sources of dynamic power dissipation are the charging and discharging of capacitance (switching) and short circuit power dissipation. Techniques to reduce dynamic power include reducing the switching voltage, but since voltage thresholds do not scale equally with the supply voltage, performance loss is experienced [7]. Parasitic capacitance and switching frequency can also be reduced; however, applying these requires a detailed understanding of the design so as to

minimize the tradeoffs involved. In this research, reduction of the switching frequency is focused on circuit optimization through alternate logic implementation. The emphasis is on reducing power dissipation by reducing the switching frequency, all the while maintaining functionality and performance.

Static power is related to the logical states of a circuit that equate to leakage current. There are two dominant types of leakage current addressed in this research: reverse PN-junction and subthreshold channel current [7]. There are several techniques available for reducing leakage power, including transistor stacking, power gating, multi-threshold voltages, etc.

2.2 Superscalar Pipelining

The basis of pipelining requires an instruction to follow a hardware flow that starts with the fetching of the instruction and finishes with the results written to the architected register file. By dividing the flow into smaller stages, throughput can be increased [8]. Jouppi's classification of the pipelined process is understood through four stages: instruction fetch (IF), instruction decode (ID), execution (EX), and write back (WB) [8]. From a resource perspective, each stage independently performs its function, allowing it to operate in parallel with the other stages.

Superscalar pipelined microprocessors are designed to fetch and issue multiple instructions every machine cycle, meaning the stages as well as the instructions can be executed in parallel [8]. The major hindrance is encountered through stalls in the execution stage of the pipeline. To address this drawback, the execution stage is divided into parallel execution units, known as a diversified pipeline. In this setting the instructions are dispatched through a buffer to the appropriate execution unit and executed out-of-order. This requires a mechanism, such as the ROB, to write back instructions in program order to avoid validity issues. This is also known as dynamic superscalar pipelining [8]. With each additional level of parallelism the complexity of the design also increases.

2.3 Reorder Buffer

A ROB is a vital component in microprocessors that allows program order instructions to be issued in parallel (based on the instruction-level parallelism), executed out-of-order, and completed in-order. ROB designs are most often implemented as a circular buffer utilizing the FIFO queue data structure. The general concepts of operation of the ROB are given in Figure 1.

The critical aspect of the ROB is maintaining knowledge of the program order. The first step is to allocate entries in the ROB according to the program order during the initial dispatch of the instruction. The tail of the buffer is then incremented accordingly. In addition, corresponding data is also allocated in the ROB, such as the destination registers, a destination register tag created for matching completed instructions to their corresponding ROB entries, and a result field entry. The ROB then waits for the instructions to execute and assigns the resulting data to the appropriate program instruction result field. Once this is accomplished, a ready flag indicates that the instruction is ready for completion. The final step completes the instructions and increments the head of the ROB. The head instruction must be completed in order to maintain the program order. The general operations of the ROB are the foundation for performing out-of-order operations in a superscalar design, but implementation can be completed through various approaches.

Figure 1. Generalized ROB
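To make the three-step flow concrete, the following C sketch models a 32-entry circular ROB (allocate at the tail, write back by tag, complete at the head). It is our own simplified illustration of Figure 1, not the authors' implementation.

#include <stdbool.h>
#include <stdint.h>

#define ROB_SIZE 32

struct rob_entry {
    uint8_t  dest_reg;   /* architected destination register */
    uint16_t data;       /* result, valid once ready is set  */
    bool     ready;      /* execution finished               */
    bool     valid;      /* entry in use                     */
};

static struct rob_entry rob[ROB_SIZE];
static unsigned head, tail;

/* Step 1: dispatch allocates an entry and returns its tag. */
unsigned rob_alloc(uint8_t dest_reg)
{
    unsigned tag = tail;
    rob[tag] = (struct rob_entry){ .dest_reg = dest_reg,
                                   .ready = false, .valid = true };
    tail = (tail + 1) % ROB_SIZE;
    return tag;
}

/* Step 2: an execution unit writes its result back by tag. */
void rob_writeback(unsigned tag, uint16_t result)
{
    rob[tag].data  = result;
    rob[tag].ready = true;
}

/* Step 3: complete in program order; only the head may retire. */
bool rob_commit(uint8_t *reg, uint16_t *val)
{
    if (!rob[head].valid || !rob[head].ready)
        return false;
    *reg = rob[head].dest_reg;
    *val = rob[head].data;
    rob[head].valid = false;
    head = (head + 1) % ROB_SIZE;
    return true;
}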

3. Related Work

There are many ROB designs, each with a specific emphasis, that are integral to the foundation of the trends in theory, design, and implementation of the ROB. This section describes the designs that were investigated as a basis for this work.

The design of the centralized ROB is the culmination of research done by Lenell et al. [1] and Wallace et al. [2]. Their research provides a detailed analysis of the functionality, theoretical implementation, and interaction of the ROB with its neighboring components. The investigation by Kucuk et al. [3] provides the distributed ROB layout in theory, without emphasis on how that implementation strategy can be realized; the strategy was heavily motivated by power optimization. The centralized and distributed ROB designs independently provide the details on which this research was founded, because the designs are different enough in architecture, but similar enough in functionality, that they could be easily compared to provide a power enhancement case.

Wann-Yun et al. investigated low power techniques through dynamic voltage scaling (DVS) [4]. Applying such techniques to a FIFO queue with only a subset of entries in use could significantly relieve unnecessary power consumption. Overhead must be considered in the design to avoid performance degradation [4]. The implementation strategy would be based on partitioning the ROB into subsets regulated by a DVS controller. An algorithm is created to determine the timing that can be placed on the entries to make them idle for the wait state of the stage flow [4]. The key aspect of dynamic voltage scaling is to ensure that the overall design will not be impacted negatively by the hardware intensive strategy that makes up dynamic voltage scaling and its controllers.

Kucuk et al. addressed several issues in the design of the ROB, including the investigation of a power efficient comparator design [5]. A large ROB is a highly intensive read and write component that can have a slow and inefficient comparator scheme when attempting to match destination tags. That research also investigates the approach of dissecting the current ROB design and applying zero byte encoding to alleviate power usage in the ROB.

The investigation completed by Ghose et al. attempted to segment a large ROB and dynamically resize it depending on trends as instructions are flowing [6]. This is the basis for dynamic ROB resizing, which, if implemented correctly, can work much like DVS without significant overhead [6]. The crucial element is the algorithm that determines when to power down a partition and when to power up the partitions in order to have them available when needed without interrupting throughput [6].

4. Design and Implementation

Two ROB designs were implemented to investigate approaches to providing a power efficient ROB component. The two designs, the centralized and distributed ROB models, are compared based on power consumption, disregarding external forces. The designs were implemented using Verilog to the specification of a Spartan 3E FPGA. By utilizing the FPGA it is possible to compare the two designs without attention to technology or other varying constraints. The Spartan 3E FPGA is a 90nm technology chip that utilizes a 50 MHz clock, and each design utilized Xilinx ISE® in order to map the design to the pins on the FPGA.

Implementation on an FPGA required that the clock be sliced into four phases. The first phase (0) of the clock sets the internal registers that assist with further processing, such as the tail + 4 register that assists in allocating the four entries in the ROB. The second phase (1) of the clock drives all output signals: the reservation station output for the tag to go with the instructions, and the register file output that provides the destination register and the results. If results are truly completed then this stage will also increment the head to the appropriate instruction in program order. The third phase (2) of the clock is the point at which entries are allocated from dispatch and elements are populated to accommodate the new instructions. The fourth phase (3) implements the final stage, in which the ROB is written to by the execution units with their results, based on the tag that was sent previously in the second phase.

Design assumptions include an instruction-level parallelism of four, 32 entries in the ROB, 16 usable registers, and each register holding a value of up to 16 bits. These assumptions allow the designs to build on the same foundation and permit a one-to-one comparison.

4.1 Centralized ROB

The centralized ROB design is based on a circular buffer, represented by a first-in-first-out queue with head and tail pointers to track the entries being worked on. The entries outside the range of the head and tail are stale data that will be overwritten as new instructions are added to the queue. The ROB is broken into 32 entries with various data elements for processing. Figure 2 provides a graphical explanation of the elements and the bit locations for each element in the ROB entries. The specific data elements of the 32 entries are described in Table 1.

Figure 2. Centralized ROB elements

Table 1. Centralized ROB Elements
Element Name | #Bits | Description
Destination Register | 4 | The final destination register the results will be written to. The 4 bits allow for 16 possible registers.
Destination Register Tag | 5 | Tag generated for register renaming. The 5 bits allow for the 32 ROB entries.
Data | 16 | The actual results that will be written to the appropriate destination register.
Current | 1 | Flag to indicate the most current destination register entry in the ROB.
Ready | 1 | Flag used to indicate when an entry is ready to be completed.
Valid | 1 | Flag used to indicate if the value is valid.
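For illustration, the 28-bit entry layout of Table 1 can be modeled as a C bitfield; this is our own sketch of the layout, not the authors' Verilog.

#include <stdint.h>

/* One centralized ROB entry as laid out in Table 1 (28 bits total). */
struct centralized_rob_entry {
    uint32_t dest_reg     : 4;   /* 16 architected registers        */
    uint32_t dest_reg_tag : 5;   /* renaming tag, 32 ROB entries    */
    uint32_t data         : 16;  /* result value                    */
    uint32_t current      : 1;   /* most recent write to dest_reg   */
    uint32_t ready        : 1;   /* result has been written back    */
    uint32_t valid        : 1;   /* entry holds a live instruction  */
};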

The data path of the centralized ROB is illustrated in Figure 3. The flow follows the previously described 3-step strategy. Specifically, the first step is to allocate the entries in the ROB for the four new instructions being dispatched in parallel. At dispatch the ROB creates four entries, adds four to the tail pointer, adds a destination register for each entry, generates the destination register tag for each entry, updates the current bit to one for each new destination register entry (and updates all other entries for the same destination register to zero), sets the ready bit to zero, and sets the valid indicator to one. The ROB then sends the tags associated with each entry to the appropriate reservation station to join the entry that holds the source registers sent from dispatch. This function is vital because it allows the tag to flow with the instruction through execution and acts as the key to get the results into the correct ROB entry.

Once the instructions are executed and results are available, the execution units use the generated tag to write the results to the ROB and update the ready bit to one. The ROB can complete an instruction only if the head instruction is ready, indicated by a ready bit of one. The entry is then completed as a write to the register file for that particular destination register. The head is then incremented for each completed instruction in order to keep the queue referencing the appropriate entries.

Figure 3. Centralized ROB

It is important to note that this centralized ROB design is one large I/O buffer. Large amounts of data are held in this one large buffer, and therefore there is a heavy burden when data is written to or read from it. Searching can become tedious for the execution units as they try to write their results to the buffer while ensuring they are on the right entry. These are some of the concerns that lead this design to be a non-power-efficient one, as the implementation will showcase.

4.2 Distributed ROB

The design of the distributed ROB has one centralized ROB component that holds pointers to several distributed ROB components (8 components, specifically, for this design). The distributed ROB is broken into 32 entries with various data elements for processing. Figure 4 provides a graphical explanation of the elements in the centralized ROB component and the bit locations for each element in the ROB entries. The specific data elements of the 32 entries are described in Table 2.

Figure 4. Centralized ROB elements of Distributed ROB

Table 2. Centralized ROB elements for Distributed ROB
Element Name | #Bits | Description
ROB Component ID | 3 | The pointer indicating in which distributed ROB component the entry lives; 3 bits allow for 8 different distributed ROB components.
ROB Component Offset | 2 | The pointer indicating in which of the 4 entries within the distributed ROB component the entry lives; 2 bits allow for 4 entries in each ROB component.

The 8 distributed components of the distributed ROB contain 4 entries each, with various data elements for processing. Figure 5 provides a graphical explanation of the elements in the distributed ROB components and the bit locations for each element. The specific elements of the 4 entries are described in Table 3.

Figure 5. Distributed ROB Component Elements

Table 3. Distributed ROB Component Elements
Element Name | #Bits | Description
Destination Register | 4 | The final destination register the results will be written to; 4 bits allow for 16 registers.
Data | 16 | The actual results that will be written to the appropriate destination register.
Current | 1 | Flag to indicate the most current destination register entry in the ROB.
Ready | 1 | Flag used to indicate when an entry is ready to be completed.
Valid | 1 | Flag used to indicate if the value is valid.
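The two pointer fields of Table 2 mean that a 5-bit entry tag simply splits into a component ID and an offset; together with the round-robin allocation described in the next paragraphs, four instructions dispatched together always land in four different components. The C sketch below is our own illustration under those assumptions.

#include <stdint.h>

#define N_COMPONENTS 8   /* distributed ROB components */
#define N_OFFSETS    4   /* entries per component      */

/* Split a 5-bit global tag into the Table 2 pointer pair.
 * Consecutive tags visit consecutive components (round robin). */
static inline unsigned component_id(unsigned tag) { return tag % N_COMPONENTS; }
static inline unsigned component_offset(unsigned tag) { return tag / N_COMPONENTS; }

/* Allocate four consecutive tags for one dispatch group; with
 * ILP = 4 and 8 components, the group spans 4 distinct components,
 * so one register-file port per component suffices at completion. */
unsigned alloc_tags(unsigned next_tag, unsigned tags[4])
{
    for (int i = 0; i < 4; i++)
        tags[i] = (next_tag + i) % (N_COMPONENTS * N_OFFSETS);
    return (next_tag + 4) % (N_COMPONENTS * N_OFFSETS);
}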

The datapath of the distributed ROB is depicted in Figure 6. Much like the centralized ROB, the flow follows the 3-step strategy previously described. The difference is that the tag now points to a specific ROB component, and the offset determines which of the four entries in the ROB component the tag is associated with. Allocation is done in a round-robin format: the first instruction allocates to the first ROB component at its first offset, and the next goes to the second ROB component at its first offset. This round-robin format is important to the possible utilization enhancements from a read/write port perspective.

Once the instructions are executed and results are available, the execution units use the generated tag as the key, write the results to the ROB components, and update the ready bit to one. At this point the ROB components are ready to complete an instruction if the instruction that is now readied is the head.

The ROB components check whether their entries are the head and whether that entry has a ready bit of one; if so, the entry is completed as a write to the register file for that particular destination register. Once that is complete, the head is incremented for each completed instruction in order to keep the queue referencing the appropriate entries.

Figure 6. Distributed ROB

5. Methodology

The Verilog designs were imported and compiled using Xilinx ISE®. To validate the semantics of the Verilog code, a test bench was created which provides a program snippet to showcase the behavior of the circuit as a whole. The snippet showcased six clock cycles and utilized all four stages of the clock in each clock cycle. The design functioned properly in behavioral simulation, as well as in the Post Place & Route simulation covering the building, mapping, and timing of the design and the determination of the associated pins on the Spartan 3E FPGA. The last function of the Post Place & Route simulation was the creation of the XPower Analyzer netlist used for the dynamic power analysis. Loading the netlist into the XPower Analyzer determined the overall power consumption and provided the data for the power analysis.

6. Results

6.1 Behavioral and Post Place & Route Simulation

The simulations of the validity of the design provided one set of results for evaluation. To verify the functionality of each design, the Verilog HDL code was placed through simulations which ultimately determine the validity at various checkpoints. The signals used in the validation simulations are described in Table 4.

Table 4. Simulation Elements
Element Name | #Bits | Description
oPhase | 2 | Clock phase that cycles from 0-3 and continues to represent all phases in the simulation.
iDestReg | 16 | Destination register value that is sent from the instruction dispatch; used to allocate the ROB entry.
iDestRegTag | 20 | Destination register tag, generated and used for register renaming purposes within the ROB entries; the value is passed with the data from the execution unit to be able to update the correct ROB entry.
iData | 64 | Value from the execution units that is to be written back to the correct ROB entry.
oDestReg_rf | 16 | Output of the ROB that tells the register file which registers to update.
oData_rf | 64 | Data that is being sent to the register file to be updated for the specific registers.

Although the designs were different, the output signals which send completed data to the register file remained the same, based on the similar clock cycle phases. This was the emphasis of the simulation results, which prove the validity of the data. The behavioral and Post Place & Route simulations yielded the same results, concluding that each design was functionally implemented on the FPGA according to the design strategy described previously in Section 4.

6.2 Power Analysis

Power analysis provided the second set of results for evaluation. Each design and its corresponding netlist were loaded into the XPower Analyzer application. For the designs mentioned in this research, the input data is received from the instruction decode stage as well as the execution units. The output is the register results that are written back to the register file at the end of a program instruction's cycle.

The Vcc power of the Spartan is divided into three different components: Vcc(int), Vcc(aux), and Vcc(out). Dynamic power consumption was investigated first. Figure 7 compares the two designs by dynamic power consumption. The Vcc(int) shows a difference of 23.64% in dynamic power consumption due to the design being distributed. Each design has the same amount of Vcc(out) dynamic power consumption. Lastly, the Vcc(aux) is not impacted by dynamic power.

Figure 7. Dynamic Power Distribution

In order to understand the 23.64% difference in dynamic power for Vcc(int), the various components and how they were impacted by the different designs were examined. In Figure 8 the specific components of the FPGA and their impacts on power consumption are shown. Note that the signals component of the FPGA shows a difference of 66.67% from the centralized to the distributed design. The signals component of the FPGA indicates the power consumed by data signals and control signals throughout the design, based on the testbench's netlist. It can be seen that each design remains the same in clock power and logic power; therefore it is the signal power driving the efficiencies in the design.

Figure 8. Power Consumption of Specific Components

With dynamic power as such a focus in the design and implementation of the ROBs, it is reasonable that the designs showcase a 23.64% increase in power efficiency from the centralized ROB design to the distributed ROB design. At a more granular level, that 23.64% is possible because of the 66.67% increase in efficiency for the power consumption of the signals, data and control, that make up the power consumption of the Vcc(int).

Next, static power was investigated. Static power consumption is indicated by the Vcc (int, aux, and out). Figure 9 indicates that there is not much of a change between the centralized and distributed designs: the Vcc(int) shows a difference of 0.064%, while the Vcc(aux) and Vcc(out) remain constant across the two designs. The focus of Figure 9 is to showcase all the static power of the Spartan 3E FPGA due to the implementation on the chip.

Figure 9. Static Power Distribution


The static power results for these designs were expected. The static power of the design indicates the leakage power of the Spartan 3E FPGA. Since neither the centralized nor the distributed ROB was designed to be more efficient from a sizing scale, each design holds the same number of elements. Also, since the designs utilize similar components on the Spartan 3E FPGA, the leakage of each of those components remains the same.

The final results examine the total power consumption of the designs. Figure 10 provides the static, dynamic, and overall power consumption. The total static power of the designs shows a difference of 0.024%, based on the Vcc(int) being the only static component that changed from centralized to distributed. The total dynamic power of the designs shows a difference of 19.26%, based on the Vcc(int) being the only dynamic power consumer that changed from centralized to distributed. Lastly, the overall power consumption of the designs shows a difference of 12.84%.

Ultimately, the results conclude that the distributed ROB is a more efficient design from a power consumption perspective. These results are based on both designs being implemented on the same Spartan 3E FPGA device with the same testbench to benchmark the designs.

Figure 10. Total Power Consumption

The design aspects of the distributed ROB that make it more power efficient are the smaller and less logic intensive centralized ROB component and the distributed ROB components. The centralized ROB component acts merely as a lookup table that points the instruction dispatch and the execution units to the correct ROB component where the actual entry exists. Without the need to constantly read and write large amounts of data to this centralized buffer, the logic decreases significantly and power efficiency is gained.

Since this centralized buffer in the design is made of a smaller number of elements comprising each entry, this buffer is much smaller in scale than the normal centralized ROB design. The smaller buffers allow for significantly less intensive logic to perform reads and writes, and eliminate searches. With the logic focusing on smaller partitions of the overall ROB, a stronger degree of parallelism is possible within the ROB logic. The distributed ROB design's focus on resource sharing allows the logic to operate more power efficiently. The second key to the efficiencies of the ROB components is in the manner of instruction decode. When the first instruction makes an entry in one ROB component, the next instruction jumps to the next ROB component. In doing so there is only need for one port to send data to the register file, since there are eight ROB components and the instruction parallelism is four. Port elimination is an important concept that can drive power consumption to be more efficient.

7. Conclusion

The ROB is a vital component of dynamic superscalar microprocessors, allowing out-of-order processing of instructions to increase throughput. Although the ROB is vital to the superscalar microprocessor, it can be a large hindrance due to its overall power consumption and the inefficiencies of different architectural styles. Based on the power consumption issues, the focus was on designing the ROB in a more dynamically power efficient manner. In order to understand the ROB and possible power efficiency strategies, the two main designs, centralized and distributed, were analyzed in detail and turned into strategies that could be realized. The realization of the two designs was in Verilog, implemented for the Spartan 3E FPGA, which allows an in-depth picture of power consumption from both a static and a dynamic perspective.

The final results showcase that the distributed ROB is more power efficient than the centralized ROB, with an overall power reduction of 12.84%. Overall, this research provided a deeper look into the ROB, provided insights on how to design a more power efficient ROB in an effort to make the overall microprocessor more power efficient, and provided an implementation strategy that could be validated against.

8. References

[1] J. Lenell, S. Wallace, N. Bagherzadeh, "A 20MHz CMOS Reorder Buffer for a Superscalar Microprocessor," 4th Annual NASA VLSI Symposium, November 1992.
[2] S. Wallace, N. Dagli, N. Bagherzadeh, "Design and Implementation of a 100 MHz ROB," 37th Midwest Symposium on Circuits and Systems, August 1994.
[3] G. Kucuk, O. Ergin, D. Ponomarev, K. Ghose, "Distributed ROB Schemes for Low Power," 21st International Conference on Computer Design, October 2003.
[4] S. Wann-Yun, C. Hsin-Dar, "Saving Register-File Static Power by Monitoring Short-Lived Temporary-Values in ROB," 13th Asia-Pacific Computer Systems Architecture Conference, August 2008.
[5] G. Kucuk, K. Ghose, D. Ponomarev, P. Kogge, "Energy-Efficient Instruction Dispatch Buffer Design for Superscalar Processors," International Symposium on Low Power Electronics and Design, August 2001.
[6] K. Ghose, D. Ponomarev, G. Kanad, "Energy-Efficient Design of the ROB," International Workshop on Power and Timing Modeling, Optimization and Simulation, September 2002.
[7] Gary K. Yeap, "Practical Low Power VLSI Design," Kluwer Academic Publishers, 1998.
[8] John Paul Shen and Mikko H. Lipasti, "Modern Processor Design: Fundamentals of Superscalar Processors," McGraw Hill, 2005.

SESSION ALGORITHMS, LOGIC, CIRCUIT/HARDWARE DESIGN, AND TOOLS

Chair(s)

TBA

Area-Time Efficient Digit-Serial-Serial Two’s Complement Multiplier

Essam Elsayed and Hatem M. El-Boghdadi Computer Engineering Department, Cairo University, Egypt

Abstract - Multiplication is an important primitive operation used in many applications. Although parallel multipliers produce results fast, they occupy considerable chip area. For applications with lengthy operands, such as cryptography, the required area grows further. A Digit-Serial-Serial multiplier receives both inputs serially, one digit per cycle. This reduces area at the expense of the number of cycles required to complete the multiplication. Digit multiplier designs are flexible with respect to the digit width, offering designers the opportunity to select the most suitable compromise between area and cycle count for the application in concern.

In this paper, a new Digit-Serial-Serial multiplier is proposed that is efficient in terms of area and area-time product. The proposed multiplier supports one operand of dynamic width while the other operand is fixed-width. In contrast, other multipliers support only fixed, equal-width operands. With a small modification, the multiplier is shown to be able to operate on 2's complement operands. The proposed multiplier also supports bit-level pipelining. That is, independent of the operand width and the digit width, the critical path of the multiplier pipeline stage can be reduced down to the delay of a D-FF, an AND gate and two full adders (FAs).

Simulation results show that the proposed multiplier reduces the required area over similar multipliers [1] by up to 20% and reduces the area-time product by up to 32%.

Keywords: Multiplier, Digit-Serial-Serial, Area, Time, 2's complement, dynamic operand

1 Introduction

Multiplication is a core operation in hardware designs. Although many multiplier designs have been proposed in the literature, these designs still have room for improvement [7].

To speed up the multiplication operation, completely parallel implementations have been proposed [3][9]. Since these designs process the two operands in parallel, they possess a short processing time. In this case, however, parallel systems require a considerably large silicon area, and area reduction becomes essential.

Reduction of multiplier area has been the field of study of many papers [1]. Many approaches were followed to reduce the area. One of the common approaches used to reduce area is to split multiplication over multiple cycles and re-use a smaller circuit that exploits the similarities in the multiplication operation. This requires splitting one or both of the operands into a number of digits, each of width d bits, and processing one digit at a time. Basically, multiplication can be carried out one digit at a time, producing one digit of the result per cycle. This approach has many variations.

One variation is Digit-Serial-Parallel multiplication [7], where one operand is input in parallel while the other operand is input one digit per cycle. This has the advantage of reducing the multiplier area, but suffers from the fact that one of the operands is still handled in parallel, keeping the required area high.

Another variation is Bit-Serial-Serial multiplication, where both operands are input one bit per cycle [8]. This has the advantage of minimal area but suffers from a high cycle count.

The third is Digit-Serial-Serial, where both operands are input one digit at a time, filling the gap between the two former approaches. In this paper, we follow this approach and propose an area and area-time product efficient multiplier.

Many design approaches are used to achieve digit multipliers. Two systematic approaches are folding [10] and unfolding [7]. Folding starts with a full parallel multiplier and truncates the basic execution unit (in the multiplier case, the adder) to the digit width rather than the operand width. On the other hand, unfolding starts with a Bit-Serial-Serial multiplier and replicates it a number of times equal to the digit width. Other systematic approaches are based on radix-2^n arithmetic [1]. An ad-hoc approach can also be found in [7], producing a Digit-Serial-Parallel multiplier. Our design adopted the folding approach, starting from a fully parallel multiplier and truncating it down to a Digit-Serial-Serial multiplier.

Compared to the large volume of research on Digit-Serial-Parallel multipliers, a much smaller volume is present for Digit-Serial-Serial multipliers. Perhaps the first attempt to achieve a Digit-Serial-Serial multiplier is the work proposed by Aggoun et al. [1]. They proposed a design that is based on radix-2^n arithmetic and supports bit-level pipelining, achieved by using 4-to-2 compressors [1] instead of adders. Their proposed multiplier is considered the first Digit-Serial-Serial multiplier in the literature and accordingly is compared to Digit-Serial-Parallel multipliers. Though their multiplier was shown to outperform the compared-to multipliers in area and area-time product measures, it imposes some constraints: no 2's complement support, no odd digit count support, and no dynamic-width operand support.


Almiladi [2] extended the work of Aggoun [1], using another, more systematic design methodology (the Table Methodology) to achieve the same results; accordingly, their multiplier inherits the same constraints. Their multiplier also does not support bit-level pipelining and introduces extra startup cycles. That is, the result digits are not produced immediately starting from the first cycle, as they are in Aggoun [1] or in the multiplier proposed in this paper.

To the best of our knowledge, this paper is the third attempt to address Digit-Serial-Serial multipliers. In this paper, a new Digit-Serial-Serial multiplier is proposed that is efficient in terms of area and area-time product. The area-time product is a balanced measure that considers both the speed and the area factors of a design rather than considering each alone. We call the proposed multiplier the Folded Digit-Serial-Serial Multiplier (FDSSM).

Let the two operands to be multiplied be A and B, of width m bits and n bits respectively. In the FDSSM operation, A and B are divided into digits of width d and fed serially, one digit of each operand per cycle. The FDSSM is shown to require (n + m)/d cycles to complete the multiplication operation.

The FDSSM supports one operand of dynamic width while the other operand (the smaller of the two) is of fixed width. In contrast, other functionally-similar multipliers support only fixed, equal-width operands. With a simple modification, the FDSSM can also perform 2's complement multiplication, which is not supported by the other functionally-similar multipliers.

The FDSSM also supports bit-level pipelining. That is, when the multiplier is pipelined, the pipeline stage delay is independent of the operand width and the digit width: the FDSSM can be pipelined with a pipeline stage delay down to the delay of one D-FF, one AND gate and two FAs.

The main reason the FDSSM improves the area is the elimination of the input buffers and the last-digits' buffers; simulation results show that the FDSSM reduces the required area over other functionally-similar multipliers by up to 20%. It also reduces the area-time product by up to 32%.

The next section describes the mathematical background of the Digit-Serial-Serial multiplier. In section 3, we describe the FDSSM circuit and its operation. In section 4, we present the simulation results and a comparison to other multiplier designs. Section 5 covers the FDSSM's properties, such as dynamic operand support, how the FDSSM can be modified to support 2's complement, and how bit-level pipelining is applied. Finally, Section 6 presents the concluding remarks.

The outer summation represents the columns of PPBM and 2 Mathematical Background the inner one represents the rows. Each cycle, the FDSSM accepts one digit of A and one digit of B. Grouping each d consecutive bits of A into a digit, In this section, we present the mathematical background equation (4) can be written as: that is used as the base of the FDSSM. In the following we consider the two multiplied operands to be of the same width. Later in section 5 we show how one of the operands can be of dn −1)/2( n−1 d −1 dynamic width. =× +lid BA ∑ ∑∑ −+ ba jjlid 2 (5) Consider the multiplication of two numbers A and B each i=0 j=0 l=0 of width n bits where A=an-1an-2….a0 and B=bn-1bn-2….b0. The paper and pencil technique to calculate A × B has two main


Grouping each d consecutive bits of B into a digit, the equation becomes:

$$
A \times B = \sum_{i=0}^{(2n/d)-1} \sum_{j=0}^{(n/d)-1} \sum_{l=0}^{d-1} \sum_{k=0}^{d-1} a_{(i-j)d+(l-k)}\; b_{jd+k}\; 2^{id+l}
\tag{6}
$$

Each iteration of the outer summation represents d consecutive columns (a column set) of the PPBM of equation (3). Within the FDSSM, each of these iterations is processed in one cycle, starting with the least significant column set and requiring a total of 2n/d cycles to complete the multiplication.

The number of partial product bits generated per cycle is the product of the iteration counts of the three remaining inner summations of equation (6): (n/d) × d × d = nd bits. The partial product digit count per cycle is then the partial product bit count per cycle, nd, divided by the digit width d, giving n digits. The partial product digits are accumulated in the same cycle using a tree of 4-to-2 compressors of depth (log n) − 1 levels and one normal adder.

Figure 1: Partial Product Bits per Cycle per Basic Block
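Equation (6) can be checked against a short bit-accurate software model that, like the FDSSM, processes one column set per cycle and carries the excess into the next set. The following C model is our own reference illustration, not the hardware design.

#include <stdint.h>

/* Reference model of equation (6): one d-bit column set is reduced
 * per cycle, for 2n/d cycles in total (n = operand width in bits,
 * d < 32, d divides 2n). */
uint64_t dss_multiply(uint32_t A, uint32_t B, int n, int d)
{
    uint64_t result = 0, carry = 0;
    int cycles = (2 * n) / d;

    for (int i = 0; i < cycles; i++) {
        /* Accumulate all partial product bits whose significance
         * falls inside the current column set [i*d, (i+1)*d). */
        uint64_t colsum = carry;
        for (int j = 0; j < n; j++)          /* rows (bits of B)   */
            for (int l = 0; l < d; l++) {    /* columns in the set */
                int k = i * d + l - j;       /* index into A       */
                if (k >= 0 && k < n)
                    colsum += (uint64_t)(((B >> j) & 1)
                            * ((A >> k) & 1)) << l;
            }
        result |= (colsum & ((1u << d) - 1)) << (i * d);
        carry = colsum >> d;                 /* to next column set */
    }
    return result;
}

/* e.g. dss_multiply(0xABCD, 0x1234, 16, 4) == 0xABCDull * 0x1234ull */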

3 The Proposed Multiplier

As mentioned in the previous section, the FDSSM runs over two stages per cycle: partial product digit generation and partial product digit accumulation. In this section, we propose an implementation for the FDSSM.

3.1 Partial Product Digits Generation

Generally speaking, the main idea is to divide multiplication into identical sets of operations that can be processed by a smaller processing unit over multiple cycles. This is in contrast to multiplication as a single set of operations processed by a full-length processing unit in one cycle. This is shown in Figure 1, where the PPBM is divided into column sets, each of width d. The processing unit in this case (the FDSSM) is responsible for generating and accumulating the partial product digits of one column set per cycle (starting with the least significant column set up to the most significant one).

The partial product digit generation per cycle is further divided within the FDSSM over n/d identical basic blocks (BB0, BB1 ... BB(n/d)-1), where each basic block is responsible for generating d digits. This is shown in Figure 2, where each basic block is responsible for a row set of width d of the PPBM. The intersection of the column set and the row set defines the partial product digits generated at the relevant cycle by the relevant basic block.

The basic blocks are shown in the upper part of Figure 2, and the lower part shows the detailed design of one basic block. Each basic block has three inputs and two outputs.

The first input of each block is the current cycle's digit of the multiplier operand B (indicated by 1 in Figure 2). The control signal of the basic block enables latching this digit within the block, such that each block latches only the digit of B corresponding to the row set it is responsible for.

The second input is one digit of the multiplicand operand A received from the previous block (indicated by 2 in Figure 2); we call it the current digit of A (with respect to the basic block).

The third input is the previous digit of A (with respect to the basic block), received from the next block (indicated by 3 in Figure 2). In summary, digits of A are shifted along the basic blocks such that the current digit of A of a basic block becomes, at the next cycle, the current digit of A of the next basic block and the previous digit of A of the former basic block.

The need for the previous digit of A can be seen in equation (3), where there is a one-bit shift-left per row. This makes each column set span two digits of A: the current digit of A and the previous digit of A. The feedback of the previous digit of A is needed to complement the shifted versions of the current digit of A. A register of width (d − 1) is used to latch and provide the previous digit of A for the last basic block.

The outputs of the block are the generated partial product digits (indicated by 4 in Figure 2), fed to the accumulation stage, and the current digit of A (indicated by 5 in Figure 2), fed to both the next and the previous blocks. This way each unit generates d partial product digits per cycle.

Figure 2: Partial Product Digits Generation Unit
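The windowing performed by one basic block can be sketched in software as follows. This is our own functional model of the lower part of Figure 2, written to match equation (6): row l of the block ANDs its latched bit of B with the current digit of A shifted left by l, completed by the top bits of the previous digit of A.

#include <stdint.h>

/* Functional model of one basic block: given the block's latched
 * d-bit digit of B, the current digit of A and the previous digit
 * of A, produce the d partial product digits of the current column
 * set. Assumes d < 32. */
void basic_block(int d, uint32_t b_digit,
                 uint32_t a_cur, uint32_t a_prev,
                 uint32_t pp_digit[])
{
    for (int l = 0; l < d; l++) {
        /* Row l is shifted left by l bits, so its d-bit window over
         * A is the current digit completed by the top l bits of the
         * previous digit (the one-bit shift per row of eq. (3)). */
        uint32_t prev_part = (l == 0) ? 0 : (a_prev >> (d - l));
        uint32_t a_window  = ((a_cur << l) | prev_part)
                             & ((1u << d) - 1);
        uint32_t b_bit = (b_digit >> l) & 1;   /* row's bit of B */
        pp_digit[l] = b_bit ? a_window : 0;    /* AND gating     */
    }
}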



3.2 Partial Product Digits Accumulation

In this section we show how the generated partial product digits are accumulated. As mentioned in section 3.1, the output of each basic block is d partial product digits. Here we use a tree of 4-to-2 compressors to accumulate the generated partial product digits, as shown in Figure 3.

Each of the 4-to-2 compressors takes 4 partial product digits and 2 carry-in bits as input and compresses them into 2 partial product digits and 2 carry-out bits. A 4-to-2 compressor possesses a delay of 2D_FA, where D_FA is one full adder delay. The 2 carry-out bits are latched and routed back as carry-in for the same compressors in the next cycle. The tree root is a normal adder that sums two partial product digits into one digit. The tree consists of (n/2) - 1 compressors and 1 normal adder. Each compressor consists of 2d full adders and the normal adder consists of d full adders, giving a total of 2d((n/2) - 1) + d = d(n - 1) full adders. The tree latency is given by 2D_FA((log2 n) - 1) + D_ADD, where D_FA is the full adder delay and D_ADD is the normal adder latency.

Figure 3: Partial Product Digits Accumulation
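As a quick check of these cost figures, the arithmetic can be tabulated directly. This small helper is ours, for illustration; the ripple-carry assumption for the root adder is ours as well, since the paper does not specify the adder structure here.

    from math import log2

    def tree_cost(n, d, D_FA=1.0, D_ADD=None):
        # d(n-1) full adders; latency in full-adder delays, with the final
        # normal adder modeled (pessimistically) as a d-bit ripple adder.
        if D_ADD is None:
            D_ADD = d * D_FA              # assumption: ripple-carry root adder
        fa_count = d * (n - 1)
        latency = 2 * D_FA * (log2(n) - 1) + D_ADD
        return fa_count, latency

    # e.g. tree_cost(32, 8) -> (248, 16.0)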

4 Experimental Results

4 Experimental Results

In this section we compare the FDSSM with the multiplier proposed by Aggoun [1]. As mentioned before, their work is the only one in the literature that proposes a Digit-Serial-Serial multiplier. The work of Almiladi [2] is a slightly modified version of the work of Aggoun [1] that uses a different design methodology. Aggoun [1] proposed two implementations: the first with a basic block per digit of the operand, while the second uses only half the count of basic blocks, buffers the most significant halves of the operands, and reroutes them for processing after the least significant halves are processed. The second implementation is the one we compare to, since it occupies less chip area and possesses a lower area-time product.

We implemented the FDSSM and the compared-to design of Aggoun [1] in VHDL. The implementation was done using Cadence Encounter Digital Implementation RTL Compiler (EDI9.1_ISR4_s273) on CentOS 6 - x368, using the NCSU-FreePDK45-1.4 cell library and targeting a clock period of 100 ps. The comparison was done with respect to area, time, and area-time product measures. All multiplier instances used equal-width operands (a constraint of Aggoun [1]). Simulation results are shown in Figures 4, 5, 6, 7, 8, and 9. In the experiments, we change the operands' width n from 8 to 64 bits while changing the digit width d from 4 to 32 bits. In the figures, 8/4 means that the operand width is n = 8 bits while the digit width is d = 4.


Figure 5: Area Improvement of FDSSM Over Aggoun [1]

Figure 7: Time Improvement of FDSSM Over Aggoun [1]

Figure 6: Time Comparison between FDSSM and Agg. [1]

Figure 8: Achievable Running Frequency

Figure 4: Area Comparison between FDSSM and Agg. [1]

Figure 4 shows how the area changes with different operand widths and digit widths. The area is measured in nanometers.

Figure 5 shows the area improvement percentage of the FDSSM over Aggoun [1]. For example, for 32/8, the area is reduced in FDSSM by about 16%. The excess area in Aggoun [1] results from the operand buffering, where n buffers are used to buffer the most-significant halves of the operands, together with an input MUX of width d. Also, the design of Aggoun [1] buffers the previous digit of both operands per unit to complement the shifted partial product digits generated. Both elements are not present in the FDSSM, causing the area reduction.

Figure 6 and Figure 7 show the clock cycle time comparison between FDSSM and Aggoun [1]. Theoretically, FDSSM outperforms the work in Aggoun [1] in time for n/d ratios less than or equal to 4; that is, when operands are split into 4 digits or less. The time required for FDSSM is given by T_FDSSM = ((2 log2 n) - 2 + d) T_FA, where T_FA is the full adder time, whereas the time for the design of Aggoun [1] is given by T_Aggoun[1] = ((2 log2 d) + 2 + d) T_FA. Thus, n/d must be ≤ 4 for T_FDSSM to be less than or equal to T_Aggoun[1]. This is reflected in the time improvement seen for cases like 32/4, 64/4, and 64/8 in Figure 7. The figure shows that FDSSM possesses a time improvement of up to 14% for n/d ratios less than or equal to 4. In the other cases, where this ratio is more than 4, the work of Aggoun [1] outperforms the FDSSM.
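The break-even condition falls out directly from the two expressions: T_FDSSM ≤ T_Aggoun[1] reduces to 2 log2 n - 2 ≤ 2 log2 d + 2, i.e. log2(n/d) ≤ 2, i.e. n/d ≤ 4. A few lines of Python (ours, assuming a unit full adder delay) make the comparison explicit:

    from math import log2

    def t_fdssm(n, d, T_FA=1.0):
        return (2 * log2(n) - 2 + d) * T_FA

    def t_aggoun(n, d, T_FA=1.0):
        return (2 * log2(d) + 2 + d) * T_FA

    for n, d in [(32, 8), (32, 4), (64, 8), (64, 4)]:
        # FDSSM is at least as fast exactly when n/d <= 4
        print(n, d, t_fdssm(n, d), t_aggoun(n, d), n // d <= 4)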


Figure 9: Area-Time Product Improvement

Figure 8 shows the achievable circuit frequency. The FDSSM frequency ranges from about 550 MHz to 960 MHz, whereas the Aggoun [1] frequency ranges from about 480 MHz to 820 MHz. The frequency graph is the reciprocal of the time graph; Aggoun [1] outperforms at 32/4, 64/4, and 64/8, whereas FDSSM outperforms at the rest of the values.

Figure 9 shows the area-time product comparison between FDSSM and Aggoun [1]. The area improvement compensates for the time lag in the 32/4 and 64/8 cases, and diminishes the time lag in the 64/4 case.

It should be noted that Aggoun [1] imposes a constraint that the digit count must be even (due to using half the digit-count of basic blocks); however, the FDSSM does not impose this constraint. That is, it supports both even and odd digit counts, giving designers more flexibility in area-speed compromises.

5 Properties of the FDSSM

In this section we introduce two properties that the FDSSM possesses. First we introduce the property of the dynamic width operand; then we show how the FDSSM can be modified to support two's complement multiplication.

5.1 Dynamic Width Operand

The proposed multiplier latches one operand's digit per cycle and shifts the other operand's digits along the basic blocks. This allows the second operand to be of any digit count (i.e., dynamic). This enables multiplication of a long operand by a fixed-width operand, which is common in cryptography applications. The number of cycles required to complete the multiplication is given by (n + m)/d, where n and m are the operand sizes, m > n. The number of basic blocks required in this case is equal to the digit count of n. That is, the area and time specifications of an FDSSM with one dynamic operand match those of an FDSSM with equal-size operands, with the operand size equal to the smaller operand.

On the other hand, in Aggoun [1], both operands are latched one digit per cycle, which constrains the multiplier to fixed, equal-width operands. Accordingly, the number of basic blocks required is equal to half the digit count of m, and n is extended to match m.

To support a dynamic width operand, another input signal is added to the FDSSM, flagging the arrival of the last digit of m. This signal tells the FDSSM that, starting next cycle, zero digits must be injected instead of m digits (no more m digits). The last result digit is produced n/d cycles after receiving the last digit of m.

5.2 Two's Complement Support

In this section we show how the FDSSM can be modified to support 2's complement operands. The 2's complement support applies the methodology proposed by Ienne and Viredaz [5]. As mentioned before, one of the operands, the latched operand, is considered the multiplier, and the second operand, the shifted operand, is considered the multiplicand.

Figure 10: Modifying the Multiplier to Support 2's Complement

To support 2's complement operands, we deal with the multiplicand and the multiplier as follows. For the shifted multiplicand operand, a sign extension is performed. In case of a dynamic-width multiplicand, an extra input is required to indicate the last digit so that sign extension starts the next cycle. As for the latched multiplier operand, the basic block of the most significant digit is modified by replacing the AND gates connected to the most significant bit of the latched digit with XOR gates (highlighted in Figure 10). This has the effect of inverting the bits of the multiplicand digits if that bit (i.e., the sign bit) is 1. The first effective carry-in of the adder corresponding to the




modified partial product digit is also reset to 1 rather than 0 if that bit is 1, such that the multiplicand negation is complete (i.e., if the sign bit of the multiplier operand is 1, the multiplicand operand is subtracted rather than added). Figure 10 shows the modification. This is not applicable to Aggoun [1], since there the multiplier sign bit is processed at each of the basic blocks, compared to the FDSSM, where it is processed only at the most significant digit's basic block.
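Arithmetically, the XOR-plus-carry-in modification implements the negative weight of the multiplier's sign bit: the sign-bit row is subtracted instead of added. The following Python sketch is our illustration of the intended arithmetic (function name and structure assumed; the hardware realizes the subtraction with the inverted digits plus the carry-in of 1 described above):

    def twos_complement_multiply(a, b, n):
        # Interpret a, b as n-bit two's complement. Every multiplier bit
        # except the sign bit adds a shifted copy of the multiplicand;
        # the sign bit subtracts one.
        def signed(x):
            return x - (1 << n) if x & (1 << (n - 1)) else x
        acc = 0
        for i in range(n - 1):
            if (b >> i) & 1:
                acc += signed(a) << i
        if (b >> (n - 1)) & 1:
            acc -= signed(a) << (n - 1)
        return acc

    # e.g. twos_complement_multiply(0b1101, 0b0011, 4) == -3 * 3 == -9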

5.3 Bit-Level Pipelining Support

A bit-level pipelined multiplier is a multiplier with a pipelined circuit such that the pipeline stage delay is independent of the digit width and the operand width(s). Multipliers with feed-backward lines cannot be bit-level pipelined, since the feedback would be out of sync due to the pipelining. Also, multipliers that use normal digit adders cannot be bit-level pipelined, since the pipeline stage delay would depend on the adder/digit width.

In the FDSSM, bit-level pipelining can be achieved by latching the compressors' outputs from one level to the next. Figure 11 shows the modifications required, and Figure 12 shows the 4-to-2 compressor circuit [1]. The final adder is replaced by a bit-level pipelined adder such as the one described in [7] and shown in Figure 13. This modification makes the pipeline stage delay equal to a D-FF, an AND gate, and two FAs, independent of the digit width.

Figure 11: Bit-Level Pipelining Support

Figure 12: Structure of a 4-To-2 Compressor [1]

Figure 13: 4-Bit Bit-Level-Pipelined Adder

6 Concluding Remarks

In this paper a new Digit-Serial-Serial multiplier is proposed that is efficient in both area and area-time product measures. The presented multiplier supports a dynamic-width operand while the other, shorter operand is fixed-width. The multiplier was shown to be able to perform 2's complement multiplication and to support bit-level pipelining. Simulation results showed that the proposed multiplier significantly reduces the required area and area-time product.

An extension of this work can be the design of a Digit-Serial-Serial divider of a similar organization or based on the proposed multiplier. Another direction is a full Digit-Serial-Serial ALU as a building block in processors.


References

[1] Aggoun, A.; Farwan, A.F.; Ibrahim, M.K.; Ashur, A., "Radix-2^n serial-serial multipliers," IEE Proceedings - Circuits, Devices and Systems, vol. 151, no. 6, pp. 503-509, Dec. 2004.
[2] Almiladi, A., "A Novel Methodology for Designing Radix-2^n Serial-Serial Multipliers," Journal of Computer Science, vol. 6, no. 4, pp. 461-469, 2010.
[3] Dimitrov, V.S.; Jarvinen, K.U.; Adikari, J., "Area-Efficient Multipliers Based on Multiple-Radix Representations," IEEE Transactions on Computers, vol. 60, no. 2, pp. 189-201, Feb. 2011.
[4] Gnanasekaran, R., "On a Bit-Serial Input and Bit-Serial Output Multiplier," IEEE Transactions on Computers, vol. C-32, no. 9, pp. 878-880, Sept. 1983.
[5] Ienne, P.; Viredaz, M.A., "Bit-serial multipliers and squarers," IEEE Transactions on Computers, vol. 43, no. 12, pp. 1445-1450, Dec. 1994.
[6] Lamberti, F.; Andrikos, N.; Antelo, E.; Montuschi, P., "Reducing the Computation Time in (Short Bit-Width) Two's Complement Multipliers," IEEE Transactions on Computers, vol. 60, no. 2, pp. 148-156, Feb. 2011.
[7] Nibouche, C.; Nibouche, M., "On designing digit multipliers," Proc. 9th International Conference on Electronics, Circuits and Systems, vol. 3, pp. 951-954, 2002.
[8] Nibouche, O.; Bouridane, A.; Nibouche, M., "New architectures for serial-serial multiplication," Proc. 2001 IEEE International Symposium on Circuits and Systems (ISCAS 2001), vol. 2, pp. 705-708, May 2001.
[9] Stelling, P.F.; Martel, C.U.; Oklobdzija, V.G.; Ravi, R., "Optimal circuits for parallel multipliers," IEEE Transactions on Computers, vol. 47, no. 3, pp. 273-285, Mar. 1998.
[10] Wu, C.W.; Cappello, P.R., "Block multipliers unify bit-level cellular multiplications," Int. J. Comp. Aid. VLSI Des., vol. 1, pp. 113-125, 1989.


RB_DSOP: A Rule Based Disjoint Sum of Products Synthesis Method

P. Balasubramanian*
Department of Electronics and Communication Engineering,
S.A. Engg College (aff to Anna Univ), Chennai 600 077, TN, India
[email protected]

R. Arisaka
School of Computing, Teesside University,
Middlesbrough TS1 3BA, United Kingdom
[email protected]

H. R. Arabnia
Department of Computer Science, University of Georgia,
415 Boyd Building, Athens, Georgia 30602-7404, USA
[email protected]

Abstract—A novel disjoint sum of products (DSOP) synthesis method is presented in this paper. It has been found from analysis that a pair of logical product terms can subscribe to just four possible logic relationships. This important observation underlies the development of the proposed Rule Based DSOP (RB_DSOP) synthesis scheme, which applies specific Boolean rules to transform the overlapping products into non-overlapping ones, in order to deduce the minimum DSOP expression from a reduced sum of products (SOP) form. The RB_DSOP method is implemented using Java, incorporating parallel problem solving capability, and is available as a stand-alone tool for teaching and/or research purposes – free access can be provided upon request. The major highlight of this research is that the cost (number of essential products) of the DSOP solution generated for a number of combinational benchmarks using the proposed RB_DSOP method is found to be significantly lower (77%) in comparison with the cost of the DSOP solution derived using a well-known open-access DSOP routine in the existing literature.

I. INTRODUCTION

A Boolean product represents a conjunction of distinct literals, where a literal specifies a Boolean variable (x) or its complement (x'). A Boolean function, f, is a mapping of the type f: {0,1}^n → {0,1,d}, where 'd' denotes a don't care condition. If d does not exist, then the function f is said to be completely specified or two-valued; otherwise it is incompletely specified. Each of the 2^n nodes in the Boolean space corresponds to a canonical product term (minterm). The ON-set, OFF-set and DC-set of f correspond to those minterms that are mapped to 1, 0 and d respectively. A Boolean equation can be expressed in the sum-of-products (SOP) or disjunctive normal form, where the products are all irredundant. In general, a SOP is said to contain the minimum number of essential products by definition, and hence it is also referred to as a minimum SOP. A Boolean specification is said to be expressed in the disjoint sum of products (DSOP) form if it is described by a logical sum of product terms that are all disjoint [2], i.e. no two product terms cover a common minterm in their expanded form – in other words, the products are non-overlapping and their logical conjunction results in a null. A DSOP form with the least number of irredundant product terms is known as a minimum DSOP.

While SOP minimization can be likened to a set covering problem, DSOP minimization can be likened to the problem of finding a minimum exact disjoint cover, which is NP-hard [2]. For example, the number of essential products comprising the SOP expression of an Achilles' heel function [2] is given by O(n/2), while the number of essential products constituting its DSOP expression is specified by O(2^(n/2) - 1), where 'n' represents the number of distinct primary inputs. DSOPs have been traditionally used in several applications in CAD areas, for example, to calculate the spectra of Boolean functions [3 - 5], or as a starting point for the minimization of Exclusive-OR SOP logic [6] [7], which forms the backbone of synthesis schemes for reversible logic circuits [8] [9] that assume significance in the realm of quantum computing. A number of DSOP methods have been proposed by researchers over the past few decades [10] – [14], relying on heuristics, considering the utilization of reduced ordered binary decision diagrams (ROBDDs), or adopting evolutionary programming techniques. Nevertheless, ROBDDs, being inherently mutually exclusive, may suffer from huge memory space requirements for higher order functionality, and therefore heuristics might be preferable. Of these, the Espresso_DSOP routine [1] is widely referred to, and it is primarily an open-access tool. In general, DSOP solutions generated by Espresso are far from optimum; this is more so the case for functions with several concurrent outputs, as the synthesis scheme resorts to group minimization of all the function outputs. An alternative approach would be to consider deriving DSOP solutions for the function outputs individually on the basis of their SOP forms, which is the strategy adopted in the case of our RB_DSOP method.

The remaining part of this paper is organized as follows. Section 2 highlights the various pair-wise logical product scenarios and discusses how select Boolean axioms are applied to transform the non-disjoint products into disjoint ones. Section 3 describes the RB_DSOP synthesis scheme from a high-level perspective, and Section 4 tabulates the SOP and DSOP costs pertaining to a large set of combinational logic benchmarks based on the Espresso_DSOP and RB_DSOP methods. Lastly, the conclusions are arrived at in Section 5.

* This research work was performed when the author was affiliated with the Department of Electronics and Communication Engineering, Vel Tech Technical University, Avadi, Chennai 600 062, TN, India.

II. PAIR-WISE LOGICAL PRODUCTS – CLASSIFICATION

In this section, we classify logical product pairs into four categories and describe ways of converting non-disjoint products into disjoint ones – disjoint products are non-overlapping, while non-disjoint product terms tend to overlap. It should be noted that the examples used for describing the subsequent logic transformations are mainly representative of general function scenarios. A pair of products is said to be overlapping if they cover a common canonical product (cube) when expanded. A canonical product term is also called a minterm, which is a unique conjunction of all input variables.

A. Pair-Wise Product Scenario 1: F(a,b,c) = ab + c

A Boolean function F(a,b,c), dependent on three binary variables, is composed of two product terms, say P1 = ab and P2 = c. The support (set) of a product term enumerates the variables constituting it. Here the support sets of product terms P1 and P2 contain distinct variables, i.e. s(P1) = {a,b} and s(P2) = {c}, implying s(P1) ∩ s(P2) = ϕ. Products P1 and P2 are said to be overlapping as they cover the common minterm abc in their expanded form, and hence they are non-disjoint.

Citing this scenario, we describe how to convert two overlapping product terms into non-overlapping (disjoint) ones, where the products consist of unique variable conjunctions with one of the product terms being a singleton. Product term P2 is said to be a singleton as |s(P2)| = 1. In this case, the converse of the absorption axiom (x + x'y = x + y) is used to transform the product term P1 thus: P1 = abc'. Now the two product terms become disjoint, and they would not overlap since they no longer cover a common minterm in their canonical forms. Therefore the transformed Boolean function F*, which is logically equivalent to the original function F, is expressed as F*(a,b,c) = abc' + c. Here, F and F* exhibit combinational equivalence, and they represent the respective SOP and DSOP function formats.

B. Pair-Wise Product Scenario 2: F(a,b,c) = ab + ac

A Boolean function F(a,b,c), dependent on three binary variables, is composed of two product terms, say P1 = ab and P2 = ac. Here the supports of product terms P1 and P2 contain the common variable a. In terms of set notations, s(P1) = {a,b}, s(P2) = {a,c} and s(P1) ∩ s(P2) ≠ ϕ. Again, product terms P1 and P2 are found to be overlapping, as they cover the common minterm abc when expanded, and so they are not disjoint.

In cognizance of this scenario, we describe how to convert two non-disjoint product terms into a disjoint product pair, where the products share one or more common variables, with neither of the product terms being a singleton. Use the distributive axiom of Boolean algebra, viz. ab + ac = a(b + c), to extract the kernel (b + c). The converse of the absorption axiom of Boolean algebra is applied to transform the kernel comprising overlapping product terms (bc is shared within the kernel) into non-overlapping products, as mentioned above. Thus the kernel (b + c) is transformed into (b + b'c). Using the distributive property of Boolean algebra, x(y + z) = xy + xz, re-enumerate the product terms in the disjunctive normal form. The transformed logic function is F*(a,b,c) = ab + ab'c. Here again, F and F* feature combinational equivalence, and they signify the respective SOP and DSOP function formats.

C. Pair-Wise Product Scenario 3: F(a,b,c,d) = ab + cd

A Boolean function F(a,b,c,d), dependent on four binary variables, consists of two product terms, given by P1 = ab and P2 = cd. The support sets of products P1 and P2 do not contain any common variable. As per set notations, s(P1) = {a,b}, s(P2) = {c,d} and s(P1) ∩ s(P2) = ϕ. Product terms P1 and P2 are overlapping since they cover the common minterm abcd in their expanded form, and hence they are non-disjoint.

This illustration depicts another, different scenario, where the product terms consist of unique variable conjunctions with neither of the products being a singleton, i.e. |s(P1)| ≠ 1 and |s(P2)| ≠ 1. In this case, the identity axiom (x + x' = 1) of Boolean algebra is applied to either of the pair of products considered; let us say to product term P2, as (a + a')cd. Following this, the product terms are expanded by applying the distributive axiom as (acd + a'cd). Now the transformed logic function is given by the logical sum of three product terms P1, P2 and P3 as: F*(a,b,c,d) = ab + acd + a'cd. It can be seen that P1 and P2 are mutually disjoint with P3, and P2 and P3 are also disjoint. With P1 and P2 being non-disjoint, using the distributive axiom the kernel (b + cd) is extracted as described in sub-section 2(B), which is then transformed into (b + b'cd) as discussed in sub-section 2(A). The distributive law is again applied to enumerate the product terms comprising the transformed function, which is given by F*(a,b,c,d) = ab + ab'cd + a'cd. Again, we find that F(a,b,c,d) and F*(a,b,c,d) are combinationally (logically) equivalent, although algebraically different.

D. Pair-Wise Product Scenario 4: Covering Term and Covered Term, Redundant Products

On account of the logic optimizations detailed above to translate a function expressed in SOP form into a DSOP form, where all the product terms are mutually exclusive, the possibility of covering and covered terms might result. Also, redundant products may crop up. In case of the former, the absorption law of Boolean algebra is used to eliminate the covered products, i.e. x + xy = x. In case of the latter, logic duplication is avoided through the idempotency law: x + x = x.
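The four scenarios reduce to one mechanical operation on cubes: rewrite the second product as a set of cubes covering it minus its overlap with the first. The following Python sketch is our illustration of that operation; the names and the cube encoding (a dict mapping a variable to 1 for the true literal or 0 for the complemented literal) are our assumptions, not the RB_DSOP tool's internals.

    def is_disjoint(p, q):
        # Two cubes are disjoint iff some variable appears with opposite
        # literals in p and q (their conjunction is then null).
        return any(v in q and q[v] != b for v, b in p.items())

    def make_disjoint(p, q):
        # Rewrite cube q as disjoint cubes covering (q AND NOT p); each
        # split mirrors a Section 2 rule application (x + x'y = x + y style).
        if is_disjoint(p, q):
            return [q]
        out, rest = [], dict(q)
        for v, b in p.items():
            if v not in rest:
                piece = dict(rest)
                piece[v] = 1 - b      # the part of q outside p on variable v
                out.append(piece)
                rest[v] = b           # shrink the remainder still overlapping p
        return out                    # the final remainder (q AND p) is covered by p

    # Scenario 1: F = ab + c  ->  make_disjoint({'c': 1}, {'a': 1, 'b': 1})
    # yields [{'a': 1, 'b': 1, 'c': 0}], i.e. F* = abc' + c, as in Section 2(A).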

III. RB_DSOP SYNTHESIS METHOD

The proposed RB_DSOP synthesis heuristic is explained through the following steps. A simple lexicographical variable ordering is followed for transformations or optimizations involving the product terms of a Boolean function.

• Step 1: Obtain the SOP form of a logic function

• Step 2: Compare each product term with every other product term in the SOP expression to check whether they are mutually disjoint. If so, go to Step 8; else proceed with Step 3

• Step 3: Enumerate all the overlapping pairs of products which do not feature any common variables but with one of the product terms being a singleton (as highlighted in Scenario 1). Transform such product term pairs into mutually exclusive ones as discussed in Section 2(A). If Scenario 1 does not arise, proceed with Step 4

• Step 4: Enumerate all the overlapping pairs of product terms where the product terms feature variable sharing but with neither of the products being a singleton (as outlined by Scenario 2). Transform such logical products into mutually exclusive pairs as discussed in Section 2(B). If Scenario 2 does not arise, proceed with Step 5

• Step 5: Enumerate all the overlapping product pairs which comprise distinct variable supports with neither of the product terms being a singleton (as portrayed by Scenario 3). Convert such product pairs into disjoint ones based on the procedure described in Section 2(C)

• Step 6: If any logical product is found to cover any other product term in the transformed function expression and/or if logic duplication occurs (as depicted by Scenario 4), eliminate the covered term and discard logic repetitions as mentioned in Section 2(D)

• Step 7: Return to Step 2 to ensure the SOP expression is transformed into a DSOP format

• Step 8: Terminate the synthesis routine, as the DSOP form of the requisite logic specification has been obtained

In general, a logic function would have several concurrent inputs and outputs. The RB_DSOP synthesis procedure is applied simultaneously to all the SOP expression(s) of the output(s) of a logic function – independently and in parallel. The reduced SOP form of a Boolean function is obtained by means of a standard open-access logic minimizer, Espresso [1]. The logical correctness of the DSOP solution derived is guaranteed by the Boolean rules applied, which are indeed well-established and proven properties, while the functional correctness of the DSOP solution is ensured by comparing each product with every other product term forming the cover of a function's primary output. However, such a comparison is performed as part of data processing. The combinational equivalence of a SOP form and its DSOP form is confirmed through the Dverify option of Espresso.

The cost of the DSOP solution derived for a logic function, starting from its SOP specification, is specified by the count of unique product terms, some/all of which may eventually be shared between the various function outputs. Depending upon the initial logic description, several iterations of some/all of the synthesis steps may be required to deduce the minimum DSOP form. The RB_DSOP synthesis package basically reads an input file described in the conventional programmable logic array (.pla) format and produces an output file in a custom-defined format which provides the following information: DSOP equations obtained corresponding to the individual primary outputs, cost factor of the respective primary outputs, cost of the entire function taking into account logic sharing if any, and description of the outputs in .pla style.

IV. EXPERIMENTAL RESULTS

The synthesis procedure of the proposed RB_DSOP heuristic, elucidated in Section 3, has been automated using Java and is configured to be a stand-alone executable (tool). Parallelism is made implicit in the RB_DSOP software package in that the tool can simultaneously solve multiple problems with ease – this attribute of parallel problem solving by use of multiple threads does not appear to be present in any DSOP package known to the authors. To highlight the efficacy of the RB_DSOP method over the well-known and widely referred Espresso_DSOP routine, we will first look at two function examples. Firstly, we consider a small combinational benchmark specification, newill, which comprises 8 primary inputs and a single primary output. The SOP and DSOP expressions corresponding to this benchmark function are specified by equations (1) – (3).

newill_SOP = bc'e'f'gh + bdef'gh + bcd'f'gh + bd'efg'h + bcefg'h + bce'f'g'h + d'e'f'g'h + a   (1)

newill_Espresso_DSOP = bc'e'f'gh + bdef'gh + bcd'f'gh + bd'efg'h + bcdefg'h + bce'f'g'h + b'd'e'f'g'h + bc'd'e'f'g'h + ab'd + ab'd'e + abc'def + abc'e'f + abcde'f + ab'd'e'f + abcd'e'f + abc'd'ef' + abcdefg + abcd'efg + abc'd'efg + abcde'f'g + ab'd'e'f'g + abcdef'g' + abc'def'g' + abcd'ef'g' + abc'de'f'g' + abcd'f'gh' + abcdef'gh' + abc'def'gh' + abc'e'f'gh' + abcdefg'h' + abcd'efg'h' + abc'd'efg'h' + abcde'f'g'h' + ab'd'e'f'g'h' + abcd'e'f'g'h' + abc'd'e'f'g'h'   (2)

newill_RB_DSOP = a'bc'e'f'gh + a'bdef'gh + a'bcd'f'gh + a'bc'd'efg'h + a'bcefg'h + a'bcde'f'g'h + a'd'e'f'g'h + a   (3)

The minimized SOP form consists of 8 products, while the RB_DSOP and Espresso_DSOP methods yield 8 and 36 disjoint product terms respectively. The cost (quantified as the number of irredundant products) of a DSOP expression may be equal to (best case solution) or greater than (typical case solution) the cost of the corresponding SOP form. From (1) – (3), it is clear that the RB_DSOP method has enabled the best possible (exact minimum) solution for the combinational benchmark newill. In comparison with the Espresso_DSOP scheme, the proposed method has effected an excellent cost reduction of 77.8% for this case study.

Let us now consider a multiple input, multiple output combinational benchmark functionality, clpl, which consists of 11 primary inputs (a,b,c,d,e,f,g,h,i,j,k) and 5 primary outputs (Y1,Y2,Y3,Y4,Y5). Its minimized SOP expressions are,

Y1_SOP = aefg + cfg + dg + b   (4)

Y2_SOP = aef + cf + d   (5)

Y3_SOP = ae + c   (6)

Y4_SOP = aefgij + cfgij + dgij + bij + hj + k   (7)

Y5_SOP = aefgi + cfgi + dgi + bi + h   (8)

The corresponding DSOP forms deduced on the basis of the proposed RB_DSOP method are given as,

Y1_RB_DSOP = ab'c'd'efg + b'cd'fg + b'dg + b   (9)

Y2_RB_DSOP = ac'd'ef + cd'f + d   (10)

Y3_RB_DSOP = ac'e + c   (11)

Y4_RB_DSOP = ab'c'd'efgh'ijk' + b'cd'fgh'ijk' + b'dgh'ijk' + bh'ijk' + hjk' + k   (12)

Y5_RB_DSOP = ab'c'd'efgh'i + b'cd'fgh'i + b'dgh'i + bh'i + h   (13)

From equations (4) – (13), it is evident that the costs of the SOP and DSOP forms (derived using the RB_DSOP method) for the benchmark function clpl are the same – 20 products. Thus the proposed method has once again facilitated the best solution for a logic function. The cost of the DSOP format obtained using the Espresso_DSOP routine is found to be 231 (product terms), and they are too many to detail here. Hence the interested reader is encouraged to check out the list of products by running the Espresso_DSOP routine on the benchmark clpl. By comparing the cost metrics of the DSOP expressions resulting from the Espresso_DSOP and RB_DSOP synthesis methods, it can be observed that the latter achieves a massive cost reduction, to the tune of 91.3%, over the former.

The preceding examples drive home the point that the proposed RB_DSOP synthesis scheme has the potential to yield exact DSOP solutions for single/multi-output functions. Experimentation was carried out involving a large number of combinational benchmarks from the MCNC benchmark suite, and the results obtained, which are shown in Table 1, duly demonstrate the efficiency of our tool. The costs of the respective SOP and DSOP forms of various combinational benchmarks, based on the Espresso_DSOP and RB_DSOP synthesis heuristics, are given in Table 1. Here the cost metric purely represents the number of unique product terms, which implies that some/all of the products might be found shared between the various primary outputs of a multi-output logic function. For example, in case of the benchmarks mainpla and xparc, the number of product terms constituting the DSOP equations derived on the basis of the RB_DSOP synthesis method is found to be 6853 and 2757 respectively, with no consideration of logic sharing between their individual primary outputs. However, taking cognizance of the common logic that is shared between the various primary outputs, the above costs sharply decrease to 283 and 287 product terms respectively.

V. CONCLUSIONS

DSOPs have been traditionally used in several applications in CAD areas, for example to calculate the spectra of Boolean functions [3 – 5], or as a starting point for the minimization of Exclusive-OR SOP logic [6] [7], which forms the backbone of synthesis schemes for reversible logic circuits [8] [9] and assumes significance in the field of quantum computing. While many DSOP synthesis strategies exist in the literature [10 – 14], they are hardly open-access; the DSOP routine of Espresso [1] is an exception in this regard. This in fact has motivated the authors to collaborate and develop an efficient open-access DSOP synthesis tool, incorporating parallel problem solving capability for easy use/replication by the scientific community. In this context, a novel DSOP synthesis heuristic (RB_DSOP) based on well-founded Boolean rules was propounded in this work. The RB_DSOP synthesis scheme has been implemented using Java on a Windows platform and is available for use as a stand-alone executable (software package) for teaching and/or research purposes – unrestricted access can be provided upon request.

To demonstrate the qualitative efficiency of our package vis-à-vis the open-access Espresso_DSOP synthesis routine, experimentation was carried out on numerous combinational benchmarks. From the results shown in Table 1, it can be inferred that Espresso_DSOP solutions are 9.5× bigger than the reduced SOP solutions, while RB_DSOP solutions are just 2.2× bigger than their SOP counterparts. Hence the latter facilitates a whopping cost reduction of the order of 77% in comparison with the former. However, it is worth noting here that heuristics, in general, do not always guarantee exact minima, and devising methods to obtain the same is construed to be practically infeasible.

REFERENCES

[1] R.K. Brayton, G.D. Hachtel, C.T. McMullen, A.L. Sangiovanni-Vincentelli, Logic Minimization Algorithms for VLSI Synthesis, Kluwer Academic Publishers, MA, 1984.
[2] T. Sasao (Ed.), Switching Theory for Logic Synthesis, Kluwer Academic Publishers, Dordrecht, The Netherlands, 1999.
[3] B.J. Falkowski, "Calculation of Rademacher-Walsh spectral coefficients for systems of completely and incompletely specified Boolean functions," Proc. IEEE ISCAS, vol. 3, pp. 1698-1701, 1993.
[4] B.J. Falkowski, C.-H. Chang, "Paired Haar spectra computation through operations on disjoint cubes," IEE Proc. Circuits, Devices and Systems, vol. 146, no. 3, pp. 117-123, August 1999.
[5] M.A. Thornton, R. Drechsler, D.M. Miller, Spectral Techniques in VLSI CAD, Kluwer Academic Publishers, Boston, MA, 2001.
[6] T. Sasao, "EXMIN2: a simplification algorithm for exclusive-OR-sum-of-products expressions for multiple-valued-input two-valued-output functions," IEEE Trans. on CAD, vol. 12, no. 5, pp. 621-632, 1993.
[7] A. Mishchenko, M. Perkowski, "Fast heuristic minimization of exclusive-sums-of-products," Proc. Intl. Workshop on Applications of Reed-Muller Expansion in Circuit Design, pp. 242-250, 2001.
[8] D. Maslov, "Efficient reversible and quantum implementations of symmetric Boolean functions," IEE Proc. Circuits, Devices and Systems, vol. 153, no. 5, pp. 467-472, October 2006.
[9] P. Gupta, A. Agrawal, N.K. Jha, "An algorithm for synthesis of reversible logic circuits," IEEE Trans. on CAD of Integrated Circuits and Systems, vol. 25, no. 11, pp. 2317-2330, November 2006.
[10] B.J. Falkowski, I. Schafer, C.-H. Chang, "An effective computer algorithm for the calculation of disjoint cube representation of Boolean functions," Proc. 36th IEEE MWSCAS, pp. 1308-1311, 1993.
[11] L. Shivakumaraiah, M.A. Thornton, "Computation of disjoint cube representations using a maximal binate variable heuristic," Proc. 34th IEEE Southeastern Symposium on System Theory, pp. 417-421, 2002.
[12] G. Fey, R. Drechsler, "Utilizing BDDs for disjoint SOP minimization," Proc. 45th IEEE MWSCAS, vol. 2, pp. II-306 - II-309, 2002.
[13] N. Drechsler, M. Hilgemeier, G. Fey, R. Drechsler, "Disjoint sum of product minimization by evolutionary algorithms," in EvoWorkshops 2004, G.R. Raidl et al. (Eds.), LNCS, vol. 3005, pp. 198-207, 2004.
[14] A. Bernasconi, V. Ciriani, F. Luccio, L. Pagli, "A new heuristic for DSOP minimization," Proc. 8th Intl. Workshop on Boolean Problems, pp. 169-174, 2008.

TABLE I. ENUMERATING THE COST OF SOP AND DSOP FORMS OF VARIOUS COMBINATIONAL BENCHMARKS

Combinational benchmark | # Primary inputs | # Primary outputs | SOP cost (# products) | Espresso_DSOP cost (# products) | RB_DSOP cost (# products)
9sym | 9 | 1 | 86 | 209 | 179
alu4 | 14 | 8 | 575 | 3549 | 1206
amd | 14 | 24 | 66 | 93 | 73
apex1 | 45 | 45 | 206 | 12667 | 352
apex3 | 54 | 50 | 280 | 3418 | 402
apex4 | 9 | 19 | 435 | 536 | 507
b2 | 16 | 17 | 106 | 331 | 123
b3 | 32 | 20 | 211 | 8832 | 330
b10 | 15 | 11 | 100 | 983 | 115
b12 | 15 | 9 | 42 | 654 | 62
bc0 | 26 | 11 | 178 | 1894 | 222
bcd | 26 | 38 | 117 | 134 | 126
chkn | 29 | 7 | 140 | 2601 | 202
clip | 9 | 5 | 120 | 359 | 167
cordic | 23 | 2 | 914 | 22228 | 6687
cps | 24 | 109 | 163 | 895 | 227
dist | 8 | 5 | 123 | 236 | 135
duke2 | 22 | 29 | 86 | 181 | 122
ex4 | 128 | 28 | 279 | 2090 | 594
ex5 | 8 | 63 | 72 | 184 | 142
ex1010 | 10 | 10 | 284 | 1331 | 976
exep | 30 | 63 | 110 | 294 | 128
gary | 15 | 11 | 107 | 810 | 120
ibm | 48 | 17 | 173 | 2259 | 353
in3 | 35 | 29 | 74 | 864 | 107
in4 | 32 | 20 | 212 | 8207 | 326
intb | 15 | 7 | 631 | 3533 | 1055
jbp | 36 | 57 | 122 | 862 | 132
mainpla | 27 | 54 | 171 | 1941 | 283
max512 | 9 | 6 | 145 | 386 | 192
max1024 | 10 | 6 | 272 | 776 | 362
misex3 | 14 | 14 | 677 | 3789 | 1557
misg | 56 | 23 | 69 | 7842 | 106
mish | 94 | 43 | 82 | 147 | 90
mlp4 | 8 | 8 | 127 | 206 | 155
mp2d | 14 | 14 | 31 | 625 | 59
newapla | 12 | 10 | 17 | 86 | 27
newcond | 11 | 2 | 31 | 60 | 37
newcpla1 | 9 | 16 | 38 | 145 | 52
opa | 17 | 69 | 79 | 141 | 106
pdc | 16 | 40 | 135 | 994 | 485
root | 8 | 5 | 57 | 154 | 60
ryy6 | 16 | 1 | 112 | 272 | 112
sao2 | 10 | 4 | 58 | 199 | 120
seq | 41 | 35 | 336 | 1040 | 378
soar | 83 | 94 | 353 | 1859 | 478
spla | 16 | 46 | 253 | 1119 | 360
sym10 | 10 | 1 | 210 | 367 | 348
t1 | 21 | 23 | 102 | 662 | 154
t481 | 16 | 1 | 481 | 2139 | 1302
table3 | 14 | 14 | 175 | 249 | 175
table5 | 17 | 15 | 158 | 336 | 162
test3 | 10 | 35 | 541 | 1778 | 1637
ti | 47 | 72 | 213 | 2067 | 351
vg2 | 25 | 8 | 110 | 863 | 164
x1dn | 27 | 6 | 110 | 706 | 160
x6dn | 39 | 5 | 82 | 1144 | 126
x7dn | 66 | 15 | 538 | 1697 | 1228
xparc | 41 | 73 | 254 | 942 | 287
z9sym | 9 | 1 | 85 | 190 | 171
Average cost (# products) | - | - | 201.9 | 1919.3 | 440.9
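The 9.5×, 2.2× and 77% figures quoted in the conclusions follow directly from the table's average-cost row; a two-line check (ours) against those averages:

    sop, espresso, rb = 201.9, 1919.3, 440.9     # average costs from Table I
    print(round(espresso / sop, 1), round(rb / sop, 1))   # -> 9.5 2.2
    print(round(100 * (espresso - rb) / espresso, 1))     # -> 77.0 (% reduction)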

J. Q. Mou, Fukun Lai, I. B. L. See, and W. Z. Lin
Data Storage Institute, 5 Engineering Drive 1, Singapore 117608

Abstract - Chassis design is a critical issue for the reliability and robustness of a notebook personal computer (PC) and its key components, especially the hard disk drive (HDD). In particular, the oscillating force induced by the built-in speakers in the notebook PC could be transmitted to the HDD via the chassis, causing position error signal (PES) of the HDD. In this paper, the chassis design, including the mounting of the speaker and HDD, is studied. Firstly, a finite-element-method (FEM) model of a commercial notebook PC is developed, consisting of the chassis, HDD, built-in speakers and other key components. Characteristics of the HDD vibration induced by the speaker in the notebook PC are examined with the FEM model. Then a simplified theoretical model consisting of lumped mass, spring, and damping elements is developed. This model is implemented to analyze the vibration transmissions in the notebook PC. The effects of the chassis design, speaker and HDD mounting on the vibration transmission from speaker to HDD via chassis in the notebook PC are investigated. Finally, a guideline for appropriate notebook PC chassis design and mounting of the speaker and HDD is proposed.

Keywords: Notebook computer, Chassis design, Hard disk drives (HDDs), Speaker, Vibration, FEM.

1. Introduction

As one type of the most popular personal computers (PC), the notebook PC gives the flexibility to work and play anywhere. Therefore, the chassis and components of notebook PCs should be designed to be robust enough to ensure the diversification of the operating environment. Hard disk drives (HDDs), as large capacity storage devices, play an important role in notebook computers. However, HDDs are sensitive to external vibration, which may introduce position error signal (PES) of the magnetic head in data reading and writing. The vibration sources of a notebook PC, as shown in Figure 1, include built-in speakers, cooling fans, the CD-ROM drive, touching force at the keypad, etc. Therefore, appropriate chassis design is critical for reducing the vibration transmission from the vibration sources to the HDD in a notebook PC.

Figure 1. Interior view of a notebook PC

A number of papers have been published on vibrations of HDDs and PCs. For example, G. Ferretti et al. analyzed the vibration of HDDs with a dynamic model and experimental measurements [1]. S. Lim et al. analyzed dynamic characteristics and shock response of HDDs and optimized the design of rubber mounts in a notebook [2]. Y. Matsuda et al. designed, fabricated and evaluated a flexible support mechanism for HDDs to intercept external vibration [3]. T. Semba et al. studied an adaptive cancellation method for vibrations of soft-mounted HDDs in a notebook PC [4]. However, there is very limited published research work on the vibration transmission induced by the built-in speakers, which is one of the major vibration sources in a notebook PC. Among them, Y. Y. Hu et al. systematically investigated the built-in speaker induced structural-acoustic vibration of the HDD in a notebook PC by FEM analysis and experimental studies, and concluded that the acoustically transmitted vibration of the speaker can be ignored compared to the structurally transmitted vibration, and that softer supporters for HDD mounting are a better choice for reduction of HDD vibration [5].


In this paper, the structural vibration transmission from the built-in speaker to the HDD in a notebook PC is studied. Firstly, a FEM model of a commercial notebook PC, including the chassis mounted with the HDD, built-in speakers and other key components, is built up with the ANSYS software. The characteristics of the HDD vibrations induced by the speakers in the notebook PC are examined with the FEM model. Based on the characteristics of the HDD vibrations, a simplified theoretical vibration model of the notebook is developed, which is described by a set of ordinary differential equations (ODEs) governing the motion of the speaker, chassis and HDD. The ODEs are solved for analysis of the vibration transmission from speaker to HDD by varying the stiffness and damping of the components. The effects of the chassis design, speaker and HDD mounting on the vibration transmission from speaker to HDD via chassis in the notebook PC are investigated. Finally, a guideline for appropriate notebook PC chassis design and mounting of the speaker and HDD is proposed.

2. FEM Model of the Notebook PC

A FEM model of a commercial notebook PC is developed with ANSYS to examine the characteristics of the HDD vibration induced by the built-in speakers. The notebook PC is mounted with one HDD and two speakers, denoted as speaker A and B, as shown in Figure 1. The FEM model of the notebook, as displayed in Figure 2, consists of a chassis mounted with one HDD, two speakers, and other key components, including the LCD monitor, keypad, battery and battery fixture, PCBs, radiator block, CD/DVD, two fans, HDD supporters with damper, and bracket connectors, etc. In total, 200,391 nodes and 531,811 elements are used in the FEM model. The FEM model has been validated in [5] by comparing simulation results with experimental results.

Figure 2. FE model of internal components of the PC

The mode superposition method is used for analysis of the vibrations induced by the speaker. The experimental out-of-plane velocity frequency response at the speaker rim [5], subject to a sweep sinusoidal current i = i0·sin(2πft), is used as the excitation input in the FEM vibration analysis.

Figure 3 shows the out-of-plane vibration response of speaker A, the chassis, and the HDD from the FEM analysis with 300 modes, subject to the oscillating excitation exerted on speaker A.

Figure 3. Displacement responses of speaker, chassis and HDD excited by speaker A

The responses of the chassis are represented by two points, one denoted as P1 from the node under speaker A, and the other denoted as P2 from the node under the HDD. As observed in Figure 3, the first dominant response is at the frequency of the speaker excitation, and the amplitudes of the responses of chassis_P1, chassis_P2 and HDD decrease gradually due to the vibration energy dissipation along the transmission routes.

Figure 4. HDD vibration excited by speaker A and B respectively


Figure 4 shows the HDD vibration induced by speaker A and B respectively. Speaker B is positioned far away from the HDD, and its mounting stiffness is calculated to be 21.8% smaller than that of speaker A. As shown in Figure 4, the vibration displacement response of the HDD induced by speaker A is larger than that by speaker B, especially at frequencies higher than 1000 Hz.

3. Theoretical Vibration Model of the Notebook PC

Apart from the numerical FEM model, the theoretical model is intended to give a fundamental understanding of the vibration transmission in the notebook PC. In this section, the simplified theoretical vibration model for fundamental analysis of the vibration transmission route from speaker to HDD via chassis in the notebook PC is developed.

3.1 Analytical model development

According to the characteristics of the HDD vibration as discussed in the FEM analysis, a simplified schematic of the vibration transmitted from the built-in speaker to the HDD via the chassis in the notebook PC can be extracted, as shown in Figure 5.

Figure 5. Notebook schematics with speaker and HDD

From Figure 5, it is shown that the vibration is transmitted from speaker to chassis (route AB), then from one side of the chassis to the other side (route BC), and finally from chassis to HDD (route CD). The analytical model is therefore developed on the basis of the three transmission routes; it consists of four degrees of freedom, including the speaker, HDD, and chassis, with different mounting methods between them. As shown in Figure 6, the HDD and speaker are modeled as two masses mounted on the chassis through spring and damping elements. The chassis is separated into two parts, i.e., chassis_p1 and chassis_p2, which support the speaker and HDD respectively. Both parts of the chassis are supported by the ground. It is noted that only one direction of vibration is considered in the present model. In addition, it is assumed that an external force is exerted on the speaker when it is playing music or other sounds.

Figure 6. Vibration model of notebook PC

Based on the mass-spring-damper model shown in Figure 6, the ODEs for the vibration motion of the speaker, chassis and HDD are developed:

m_s·z̈_s + c_s·(ż_s - ż_c1) + k_s·(z_s - z_c1) = f   (1)

m_c1·z̈_c1 + c_s·(ż_c1 - ż_s) + k_s·(z_c1 - z_s) + c_c·(ż_c1 - ż_c2) + k_c·(z_c1 - z_c2) + c_n·ż_c1 + k_n·z_c1 = 0   (2)

m_c2·z̈_c2 + c_c·(ż_c2 - ż_c1) + k_c·(z_c2 - z_c1) + c_h·(ż_c2 - ż_h) + k_h·(z_c2 - z_h) + c_n·ż_c2 + k_n·z_c2 = 0   (3)

m_h·z̈_h + c_h·(ż_h - ż_c2) + k_h·(z_h - z_c2) = 0   (4)

where m_s, m_h, m_c1, m_c2 are the masses of the speaker, HDD, chassis_p1, and chassis_p2 respectively; k_s, k_h, k_c, k_n represent the stiffnesses of the speaker mounting, HDD mounting, chassis, and notebook base respectively; c_s, c_h, c_c, c_n are the damping coefficients of the speaker mounting, HDD mounting, chassis, and notebook base respectively; and f is the excitation force of the speaker. In the analytical model, the transmission route AB is modeled by the speaker mounting with k_s and c_s, the route BC is modeled by the two parts of the chassis with m_c1, m_c2, k_c and c_c, and route CD by the HDD mounting with k_h and c_h. Furthermore, the whole notebook supported by the ground is also modeled, with k_n and c_n.
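For readers who want to experiment with the model, the ODEs (1)-(4) assemble into the standard M·z̈ + C·ż + K·z = F(t) form. A minimal Python/NumPy sketch (ours; the function name is an assumption, and the numeric values would be placeholders to be identified from the FEM modes):

    import numpy as np

    def assemble_matrices(m_s, m_c1, m_c2, m_h,
                          k_s, k_c, k_h, k_n,
                          c_s, c_c, c_h, c_n):
        # DOF order: [z_s, z_c1, z_c2, z_h], per equations (1)-(4).
        M = np.diag([m_s, m_c1, m_c2, m_h])
        K = np.array([[ k_s,            -k_s,              0.0,   0.0],
                      [-k_s, k_s + k_c + k_n,             -k_c,   0.0],
                      [ 0.0,            -k_c,  k_c + k_h + k_n,  -k_h],
                      [ 0.0,             0.0,             -k_h,   k_h]])
        C = np.array([[ c_s,            -c_s,              0.0,   0.0],
                      [-c_s, c_s + c_c + c_n,             -c_c,   0.0],
                      [ 0.0,            -c_c,  c_c + c_h + c_n,  -c_h],
                      [ 0.0,             0.0,             -c_h,   c_h]])
        return M, C, K   # force vector: F = [f, 0, 0, 0]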


3.2 Transfer functions

When playing music or other sounds, the oscillating force acting on the diaphragm of the speaker induces the vibration of the speaker body, which subsequently causes the vibration of the chassis and HDD. A transfer function characterizes the vibration transmission route in a simple and clear way. Through the Laplace transformation, the above equations (1)-(4) can be solved in the frequency domain, which gives the transfer function from speaker to HDD via chassis.

Let Z_i(s) denote the Laplace transform of z_i(t). The transfer function from the excitation force to the speaker vibration displacement is

H_AA = Z_s / F = (a_c1·a_c2·a_h - a_h·b_c^2 - a_c1·b_h^2) / D_f   (5)

and the transfer function of the transmission route AB is

H_AB = (a_c2·a_h - b_h^2)·b_s / D_f   (6)

and route BC is

H_BC = a_h·b_c / (a_c2·a_h - b_h^2)   (7)

and route CD is

H_CD = b_h / a_h   (8)

where

a_s = s^2·m_s + s·c_s + k_s
b_s = s·c_s + k_s
a_c1 = s^2·m_c1 + s·(c_n + c_s + c_c) + k_n + k_c + k_s
b_c = s·c_c + k_c
a_c2 = s^2·m_c2 + s·(c_n + c_h + c_c) + k_n + k_h + k_c
a_h = s^2·m_h + s·c_h + k_h
b_h = s·c_h + k_h
D_f = a_s·a_c1·a_c2·a_h - a_s·a_h·b_c^2 - a_s·a_c1·b_h^2 - a_c2·a_h·b_s^2 + b_s^2·b_h^2

Transfer functions (5)-(8) can be combined to represent the transfer function of the vibration transmission for all the routes in the notebook PC. For example,

H_AC = H_AB·H_BC   (9)

H_AD = H_AB·H_BC·H_CD   (10)

The above functions (9)-(10) describe the vibration transferred from the speaker to the chassis and HDD.

3.3 Analysis of the vibration transmission in the notebook

The analysis of the vibration transmission from the speakers to the HDD via the chassis in the notebook PC is then conducted with the transfer functions. The lumped parameters used in the theoretical model are determined from four dominant modes of the HDD vibrations induced by speaker A in the FEM analysis, as shown in Figure 3.

Figure 7 illustrates the analytical results of the vibration frequency response of the speaker, the two parts of the chassis, and the HDD, induced by a unit force acting on the speaker. As shown in Figure 7, there are four peaks corresponding to the four dominant modes of the whole system, at resonant frequencies of 281 Hz, 492 Hz, 712 Hz and 1252 Hz respectively. The frequencies of the four peaks agree well with the FEM results shown in Figure 3. The vibration energy is first transmitted to the speaker, and its vibration spectrum thus has the highest amplitude. The energy is then transmitted to the chassis and finally via the chassis to the HDD. During the transmission, the energy is dissipated by the mounting and chassis structures. Therefore, the amplitude of the vibration gradually decreases.

Figure 7. Vibration displacement response of speaker, chassis and HDD
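A short numerical sketch (ours) shows how the route transfer functions compose in practice. All parameter values are placeholders rather than the paper's identified values, except k_s and k_h, which are taken from the mounting values quoted in section 4:

    import numpy as np

    m_s, m_c1, m_c2, m_h = 0.02, 0.4, 0.4, 0.1          # kg (assumed)
    k_s, k_c, k_n, k_h = 1.35e5, 1.0e6, 5.0e5, 2.0e5    # N/m
    c_s, c_c, c_n, c_h = 12.0, 60.0, 40.0, 25.0         # N*s/m (assumed)

    def H_AD(f_hz):
        s = 1j * 2 * np.pi * f_hz
        a_s  = s**2 * m_s  + s * c_s + k_s
        b_s  = s * c_s + k_s
        a_c1 = s**2 * m_c1 + s * (c_n + c_s + c_c) + k_n + k_c + k_s
        b_c  = s * c_c + k_c
        a_c2 = s**2 * m_c2 + s * (c_n + c_h + c_c) + k_n + k_h + k_c
        a_h  = s**2 * m_h  + s * c_h + k_h
        b_h  = s * c_h + k_h
        D_f  = (a_s * a_c1 * a_c2 * a_h - a_s * a_h * b_c**2
                - a_s * a_c1 * b_h**2 - a_c2 * a_h * b_s**2 + b_s**2 * b_h**2)
        H_AB = (a_c2 * a_h - b_h**2) * b_s / D_f   # speaker -> chassis_p1, eq. (6)
        H_BC = a_h * b_c / (a_c2 * a_h - b_h**2)   # chassis_p1 -> chassis_p2, eq. (7)
        H_CD = b_h / a_h                           # chassis_p2 -> HDD, eq. (8)
        return H_AB * H_BC * H_CD                  # eq. (10)

    freqs = np.logspace(1, 4, 400)                 # 10 Hz to 10 kHz
    magnitude = np.abs(np.array([H_AD(f) for f in freqs]))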


4. Analysis of Chassis Design and Component Mounting

The effects of the chassis design, speaker mounting and HDD mounting on the vibration transmission from speaker to HDD via chassis in the notebook PC are studied by varying the stiffness and damping of the components in the analytical model.

4.1 Analysis of chassis design

With the transfer functions, the chassis design and component mounting can be analyzed for reducing the vibration transmission energy from speaker to HDD, or for shifting the resonant mode away from a specific frequency range. First of all, the chassis can be designed to be less sensitive to the vibration source, and more effective at dissipating vibration energy, on the transmission route BC. In the present model, the chassis-related stiffness and damping, k_c, c_c, k_n and c_n, can be optimized to reduce the vibration transmission from speaker to HDD, as shown in Figure 8. It is noted that normalized parameters of the stiffness and damping are displayed in this and the subsequent figures in the parametric studies.

Figure 8. Effect of chassis stiffness and damping on HDD vibration by route BC

It is observed in Figure 8 that by using a chassis designed with smaller stiffness and larger damping, the vibration transmission at high frequencies above 1000 Hz can be reduced, but the vibration transmission at frequencies below 500 Hz is significantly increased. This phenomenon is characterized in Figure 8 from the solid line I to the dashed line III. It is also seen from Figure 8 that a harder chassis is not good for the vibration transmission at high frequencies above 1000 Hz, as observed from the solid line I to the dash-dot line II. As the vibration at low frequency may be compensated by the servo controller [6,7], a softer chassis may be helpful to reduce the vibration transmission at high frequency. However, if the chassis stiffness is too small, the HDD vibration at low frequency will increase substantially and go out of the range of the servo controller.

The relative location between the speaker and HDD in the chassis design is one of the key factors relevant to the vibration transmission from speaker to HDD by route BC. In the present model, the stiffness and damping change due to the relative locations between speaker and HDD is incorporated into the parameters k_c and c_c of the route BC. In general, if the relative distance between the speaker and HDD is larger, the stiffness between them may be smaller and more vibration energy is dissipated, in case all the other conditions are the same. It is observed from Figure 8 that, with lower k_c and larger c_c, the vibration transmission can be reduced at high frequencies. Therefore, it is recommended to arrange the speakers mounted far away from the HDD in the chassis design, such as positions #4 and #5 as illustrated in Figure 9.

Figure 9. HDD and speaker mounting locations in notebook chassis design

4.2 Analysis of speaker mounting

The influence of the speaker mounting on the vibration of the whole system and the respective components is investigated. As stated in the analytical model, the speaker mounting is modeled by the spring and damping elements with stiffness k_s and damping coefficient c_s. Varying these two parameters directly affects the local vibration transmission route AB, as shown in Figure 10, where the stiffness and damping coefficient are normalized to better


understand the effects of these two parameters. The decrease of the stiffness and the increase of the damping make the vibration peak shift from high frequency to low frequency, and therefore reduce the amplitude at high frequency and amplify the magnitude at low frequency. The mode shifting characteristics can be used to move the speaker mounting mode away from the frequency of the excitation source. The soft speaker mounting with speaker mounting stiffness 1.35×10^5 N/m and damping ratio 5.5%, as illustrated with the dashed line III in Figure 10, is effective in reducing the vibrations transmitted to the HDD at frequencies above 500 Hz.

Figure 10. Effect of the different speaker mounting on HDD vibration by route AB

Most HDDs are actuated at high frequencies, and the vibration at low frequencies can be compensated by servo controller optimization [6,7]. Hence it is critical to reduce the amplitude of the vibration transmission from the speaker at high frequencies through proper chassis and mounting design. To achieve this, the speaker mounting can be made softer; for example, the speaker can be screwed to the chassis using foams and/or rubber grommets, as illustrated in Figure 11. The foams and rubber grommets could help to reduce the stiffness of the speaker mounting and increase the damping for dissipation of vibration energy.

Figure 11. Speaker mounting by foam & rubber grommet

4.3 Analysis of HDD mounting

The HDD mounting has a direct influence on the transmission of vibration to the HDD by route CD. A good HDD mounting could isolate the vibration from the built-in speakers. In the present model, k_h and c_h represent the HDD mounting in the notebook PC. The influences of k_h and c_h on the vibration transmitted to the HDD induced by the speaker are analyzed with the analytical model. The isolation effect of the HDD mounting can be achieved at high frequency by a soft mounting with lower stiffness and higher damping, as demonstrated in Figure 12. By using the soft HDD mounting with HDD mounting stiffness 2×10^5 N/m and damping ratio 13%, as illustrated with the dashed line III in Figure 12, the HDD vibration at frequencies above 500 Hz is significantly reduced. However, it is also noticed that the first mode is shifted to a lower frequency and the transmission amplitude at frequencies around 200 Hz is increased due to the softer mounting.

Figure 12. Effect of different HDD mounting on HDD vibration by route CD
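The trade-off described here is easy to reproduce with the route CD transfer function of section 3.2. This fragment (ours; every value is a placeholder except the quoted soft-mounting stiffness and damping ratio) compares a stiff and a soft HDD mounting:

    import numpy as np

    m_h = 0.1                                    # kg, assumed HDD mass
    def H_CD(f_hz, k_h, zeta):
        c_h = 2 * zeta * np.sqrt(k_h * m_h)      # damping ratio -> coefficient
        s = 1j * 2 * np.pi * f_hz
        return (s * c_h + k_h) / (s**2 * m_h + s * c_h + k_h)   # eq. (8)

    f = np.logspace(1, 4, 300)
    stiff = np.abs(H_CD(f, 8e5, 0.05))           # stiff, lightly damped (assumed)
    soft  = np.abs(H_CD(f, 2e5, 0.13))           # the soft mounting quoted in the text
    # The soft mounting attenuates more above roughly 500 Hz, but its
    # resonance peak moves to a lower frequency, as noted in the text.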


From the analysis above, a softer mounting with lower stiffness and higher damping can be used to reduce the vibration transmission at high frequencies. As shown in Figure 13, HDD mounting to the notebook chassis with soft supporters and dampers is an effective measure for the HDD to intercept vibrations from the built-in speakers at high frequencies.

Figure 13. HDD mounting with soft supporter & damper: (a) HDD mounting with soft supporter; (b) HDD mounting with damper

5. Conclusions

The characteristics of the HDD vibrations induced by the speakers and the vibration transmission in a notebook PC have been studied with the FEM and theoretical models. The reduction of the vibration transmitted to the HDD from the built-in speakers via the chassis is investigated in three ways, including chassis design, speaker mounting and HDD mounting. It is concluded that the illustrated soft speaker mounting (mounting stiffness 1.35×10^5 N/m and damping ratio 5.5%) and soft HDD mounting (mounting stiffness 2×10^5 N/m and damping ratio 13%) are effective for the reduction of HDD vibration at frequencies above 500 Hz. Furthermore, a softer chassis with a high damping ratio, and placing the speaker far away from the HDD, are useful for reducing the HDD vibration at high frequencies above 1000 Hz. However, the trade-off of the softer mounting and chassis design is that the vibration transmission may be increased significantly at frequencies below 500 Hz. Hence an optimal design of the chassis and component mounting with appropriate stiffness and high damping ratio is essential for reducing the vibration at high frequency while avoiding large uncontrollable vibration at low frequency.

References

[1] G. Ferretti, G. Magnani, and P. Rocco, "Modeling and experimental analysis of the vibrations in hard disk drives," IEEE/ASME Trans. Mechatronics, 7(2), pp. 152-160, 2002.
[2] S. Lim, Y. B. Chang, N. C. Park, and Y. P. Park, "Optimal design of rubber mounts supporting notebook HDD for shock and vibration isolation," in Proc. APMRC 2006, Singapore, 2006.
[3] Y. Matsuda, S. Nakamura, M. Sega, and Y. Katou, "Flexible support mechanism for hard disk drives to decrease vibration disturbance," IEEE Trans. Magn., 45(11), pp. 5108-5111, 2009.
[4] T. Semba, M. T. White, and F. Y. Huang, "Adaptive Cancellation of Self-Induced Vibration," IEEE Trans. Magn., 47(7), pp. 1958-1963, 2011.
[5] Y. Y. Hu, S. Yoshida, S. Nakamura, K. Watanabe, W. Z. Lin, E. T. Ong, and J. Q. Mou, "Analysis of built-in speaker-induced structural-acoustic vibration of hard disk drives in Notebook PCs," IEEE Trans. Magn., 45(11), pp. 4950-4955, 2009.
[6] T. Semba, M. T. White, and F. Y. Huang, "Adaptive Cancellation of Self-Induced Vibration," IEEE Trans. Magn., 47(7), pp. 1958-1963, 2011.
[7] W. J. Cao, J. Y. Wang, M. Z. Ding, Q. Bi, and K. K. Ooi, "Low Frequency Vibration Detection and Compensation in Hard Disk Drive," IEEE Trans. Magn., 47(7), pp. 1964-1969, 2011.


Accurate Throughput Derivation of Pipelined NULL Convention Logic Asynchronous Circuits

Liang Zhou and Scott C. Smith Department of Electrical Engineering University of Arkansas Fayetteville, Arkansas, USA [email protected] and [email protected]

Abstract - A throughput estimation formula for optimally pipelining NULL Convention Logic (NCL) asynchronous circuits was presented in the literature. However, it ignores register delays. This paper presents a precise throughput derivation formula for pipelined NCL circuits. The formula was verified by Spice simulation and can be used for static timing analysis.

Keywords: asynchronous circuits; pipelining; throughput derivation; NULL Convention Logic (NCL)

I. INTRODUCTION

The development of synchronous circuits currently dominates the semiconductor design industry. However, there are major limiting factors to the synchronous, clocked approach, including the increasing difficulty of clock distribution, increasing clock rates, decreasing feature size, increasing power consumption, timing closure effort, and difficulty with design reuse. Asynchronous (clockless) circuits require less power, generate less noise, produce less electromagnetic interference (EMI), and allow for easier reuse of components, compared to their synchronous counterparts. Quasi-delay-insensitive (QDI) NULL Convention Logic (NCL) [1] is one of the primary delay-insensitive asynchronous paradigms; a number of circuits have been successfully designed using NCL [2]. NCL systems can be optimized for speed by partitioning the combinational circuitry and inserting delay-insensitive (DI) registers and corresponding completion components, utilizing either the full-word or bit-wise completion strategy [3].

A throughput estimation formula for pipelined NCL systems was proposed in the literature [3]. However, it ignores register delays and is therefore not precise, especially when applied to finely pipelined NCL systems. As a result, it cannot be used for static timing analysis. In this paper, we propose an accurate throughput derivation formula including register delays for pipelined NCL systems. The formula was verified by Spice simulation and can be used for static timing analysis.

Section II presents an overview of NCL. Section III derives the throughput of non-pipelined NCL systems; Section IV derives the throughput of pipelined NCL systems. Section V verifies the formulas utilizing Spice simulation; and Section VI provides conclusions and future work.

II. INTRODUCTION TO NCL

NCL circuits utilize multi-rail logic, such as dual-rail, to achieve delay-insensitivity. A dual-rail signal, D, consists of two wires, D0 and D1, which may assume any value from the set {DATA0, DATA1, NULL}. The DATA0 state (D0 = 1, D1 = 0) corresponds to a Boolean logic 0, the DATA1 state (D0 = 0, D1 = 1) corresponds to a Boolean logic 1, and the NULL state (D0 = 0, D1 = 0) corresponds to the empty set, meaning that the value of D is not yet available. The two rails are mutually exclusive, such that both rails can never be asserted simultaneously; this state is defined as an illegal state.

NCL circuits are comprised of 27 fundamental gates [4]. These 27 gates constitute the set of all functions consisting of four or fewer variables. The primary type of threshold gate, shown in Fig. 1, is the THmn gate, where 1 ≤ m ≤ n. THmn gates have n inputs, and at least m of the n inputs must be asserted before the output will become asserted. NCL threshold gates are designed with hysteresis state-holding capability, such that all asserted inputs must be de-asserted before the output will be de-asserted. Therefore, a THnn gate is equivalent to an n-input C-element [5] and a TH1n gate is equivalent to an n-input OR gate. NCL threshold gates may also include a reset input to initialize the output. Circuit diagrams designate resettable gates by either a d or an n appearing inside the gate, along with the gate's threshold; d denotes the gate as being reset to logic 1, and n to logic 0. The static NCL threshold gate design is shown in Fig. 2, where hold0 = set' and hold1 = reset'.

NCL circuits communicate using request and acknowledge signals, Ki and Ko, respectively, to prevent the current DATA wavefront from overwriting the previous DATA wavefront, by ensuring that the two DATA wavefronts are always separated by a NULL wavefront. The acknowledge signal from the receiving circuit is the request signal to the sending circuit. When the receiver circuit latches the input DATA, the corresponding Ko signal will become logic 0, indicating a Request-For-NULL (RFN); and when it latches the input NULL, the corresponding Ko signal will become logic 1, indicating a Request-For-DATA (RFD). When the sending circuit receives a RFD/RFN on its Ki input, it will allow a DATA/NULL wavefront to be output, respectively. Request/acknowledge signals are usually generated using C-elements.

Figure 1. THmn threshold gate.
Figure 2. Static NCL threshold gate design.

III. THROUGHPUT DERIVATION OF NON-PIPELINED NCL SYSTEMS

Non-pipelined NCL systems contain two DI registers, one at the input and one at the output. In NCL systems, the DATA-to-DATA cycle time (TDD) [3] plays a role analogous to the clock period in a synchronous system and is the reciprocal of throughput. To derive the throughput of NCL systems, the external interface is assumed to respond to Ko signals and circuit outputs immediately. Four system parameters are defined in the handshaking sequence, as shown in Figs. 3-6:

TD: DATA propagation time (from when all input register inputs become DATA and Ki signals become RFD, to when all inputs of the output register become DATA).

TRFN: NULL request time (from when all output register inputs become DATA and Ki signals become RFD, to when all Ki inputs of the input register become RFN).

TN: NULL propagation time (from when all input register inputs become NULL and Ki signals become RFN, to when all inputs of the output register become NULL).

TRFD: DATA request time (from when all output register inputs become NULL and Ki signals become RFN, to when all Ki inputs of the input register become RFD).

Figure 3. DATA propagation time.
Figure 4. NULL request time.
Figure 5. NULL propagation time.
Figure 6. DATA request time.

In non-pipelined NCL systems, the end of the DATA request time of the current cycle is the beginning of the DATA propagation time of the next cycle. Therefore, TDD = TD + TRFN + TN + TRFD, and throughput = 1 / TDD. The estimation in [3] is inaccurate because it ignores register delays in the definitions of the four system parameters, since it was used to determine where to add pipeline stages, not to precisely calculate timing.

If a gate delay is used as the minimal delay unit, then TD = TN, TRFN = TRFD, and the formula can be simplified. The schematic of a 1-bit DI register is shown in Fig. 7. The gate delay from its input to its output is 1. The gate delay from its input to Ko is 1.5, because the inverted TH12 gate is equivalent to a NOR gate, which does not have an output inverter and is therefore counted as 0.5 gate delay, compared to a full gate delay for other types of threshold gates. As shown in Fig. 8, if the Combinational Logic has TCOMB gate delays and the Completion Logic has TCOMP gate delays, then TD = TN = 1 + TCOMB, TRFN = TRFD = 1.5 + TCOMP, and TDD = 2 * (2.5 + TCOMB + TCOMP). The formula in [3] estimates TDD as 2 * (TCOMB + TCOMP), which is imprecise.

Figure 7. 1-bit DI register.
Figure 8. System parameters of non-pipelined NCL systems using gate delays.
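A minimal sketch (not from the paper) of the non-pipelined cycle-time expression just derived, with a gate delay as the unit of time:

    # TDD for a non-pipelined NCL system, in gate delays:
    # register input-to-output delay = 1, input-to-Ko delay = 1.5.
    def tdd_nonpipelined(t_comb: float, t_comp: float) -> float:
        """TDD = TD + TRFN + TN + TRFD = 2 * (2.5 + TCOMB + TCOMP)."""
        t_d = t_n = 1 + t_comb          # DATA/NULL propagation
        t_rfn = t_rfd = 1.5 + t_comp    # RFN/RFD request generation
        return t_d + t_rfn + t_n + t_rfd

    # Example: 5 gate delays of combinational logic, 2 of completion logic.
    print(tdd_nonpipelined(5, 2))   # 19.0 gate delays
    print(2 * (5 + 2))              # 14 -- the less precise estimate from [3]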

IV. THROUGHPUT DERIVATION OF PIPELINED NCL SYSTEMS

The throughput of pipelined NCL systems is determined by the stage with the largest TDD. In this paper, we define a stage as shown in Fig. 9, where registers are shared by adjacent stages. TDDi of stagei has 3 possibilities [3]. As shown in Fig. 10, if TNi + TRFDi < TRFDi-1 + TDi-1, then when all of the Ki signals of the input register of stagei become RFD, it has to wait for (TRFDi-1 + TDi-1 - TNi - TRFDi) before all of the inputs of the input register of stagei become DATA. Therefore, TDDi = TDi + TNi + TRFDi + TRFNi + (TRFDi-1 + TDi-1 - TNi - TRFDi) = TRFDi-1 + TDi-1 + TDi + TRFNi. Similarly, as shown in Fig. 11, if TDi + TRFNi < TRFNi-1 + TNi-1, then when all of the Ki signals of the input register of stagei become RFN, it has to wait for (TRFNi-1 + TNi-1 - TDi - TRFNi) before all of the inputs of the input register of stagei become NULL. Therefore, TDDi = TDi + TNi + TRFDi + TRFNi + (TRFNi-1 + TNi-1 - TDi - TRFNi) = TRFNi-1 + TNi-1 + TNi + TRFDi. If neither of the above two conditions is true, TDDi = TDi + TNi + TRFDi + TRFNi.

In summary, TDD = max(TRFDi-1 + TDi-1 + TDi + TRFNi, TRFNi-1 + TNi-1 + TNi + TRFDi, TDi + TNi + TRFDi + TRFNi) over all stages i. If a gate delay is used as the minimal delay unit, then TD = TN, TRFN = TRFD, and the formula can be simplified as TDD = max(2 * (2.5 + TCOMBi + TCOMPi)) over all stages i. The formula in [3] estimates TDD as 2 * (TCOMBi + TCOMPi), which is imprecise, especially when applied to finely pipelined NCL systems.
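The per-stage maximum above translates directly into code. The sketch below is illustrative, not the authors' tooling; it assumes the first stage's environment responds immediately, so the cross-stage terms apply only from the second stage onward:

    def tdd_pipelined(stages):
        """stages: per-stage dicts with keys 'TD', 'TN', 'TRFD', 'TRFN'.
        Returns the worst-case DATA-to-DATA cycle time over all stages."""
        candidates = []
        for i, s in enumerate(stages):
            # every stage is bounded by its own handshaking cycle
            candidates.append(s['TD'] + s['TN'] + s['TRFD'] + s['TRFN'])
            if i > 0:
                p = stages[i - 1]   # neighbouring-stage interaction terms
                candidates.append(p['TRFD'] + p['TD'] + s['TD'] + s['TRFN'])
                candidates.append(p['TRFN'] + p['TN'] + s['TN'] + s['TRFD'])
        return max(candidates)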

Figure 9. System parameters of pipelined NCL systems with gate delay.

Figure 10. TDDi derivation when TNi + TRFDi < TRFDi-1 + TDi-1.

Figure 11. TDDi derivation when TDi + TRFNi < TRFNi-1 + TNi-1.

V. SPICE VERIFICATION OF THE FORMULAS

To verify the formula proposed in Section IV, a 3-stage full-word pipelined 4×4 NCL multiplier [3] was implemented with the IBM cmos10lpe 65 nm process at the transistor level and simulated with the Cadence Spectre simulator. A Verilog-A controller was designed to characterize the four system parameters of each stage and to measure TDD with different input patterns. As shown in Table I, the measured TDD was compared with the predicted TDD calculated from the formula, and they matched.

VI. CONCLUSION

In this paper, we propose an accurate throughput derivation formula for pipelined NCL systems. The formula was verified by Spice simulation and can be used for static timing analysis. Future work consists of NCL threshold gate library characterization with Synopsys NCX and static timing analysis with Synopsys PrimeTime.

REFERENCES

[1] K. M. Fant and S. A. Brandt, "NULL Convention Logic: A Complete and Consistent Logic for Asynchronous Digital Circuit Synthesis," International Conference on Application Specific Systems, Architectures, and Processors, pp. 261-273, 1996.
[2] R. Jorgenson, L. Sorensen, D. Leet, M. Hagedorn, D. Lamb, T. Friddell, and W. Snapp, "Ultralow-power operation in subthreshold regimes applying clockless logic," Proceedings of the IEEE, vol. 98, pp. 299-314, Feb. 2010.
[3] S. C. Smith, R. F. DeMara, J. S. Yuan, M. Hagedorn, and D. Ferguson, "Delay-Insensitive Gate-Level Pipelining," Elsevier's Integration, the VLSI Journal, vol. 30/2, pp. 103-131, October 2001.
[4] G. E. Sobelman and K. M. Fant, "CMOS Circuit Design of Threshold Gates with Hysteresis," IEEE International Symposium on Circuits and Systems, pp. 61-65, 1998.
[5] D. E. Muller, "Asynchronous Logics and Application to Information Processing," in Switching Theory in Space Technology, Stanford University Press, pp. 289-297, 1963.

TABLE I. SPICE VERIFICATION RESULTS

System Parameters and Derivations (ps)    X = 15, Y = 15   X = 0, Y = 0   X = 7, Y = 8
TD1                                             267             279            349
TN1                                             393             500            658
TRFD1                                           346             358            357
TRFN1                                           413             412            413
TD2                                             261             249            320
TN2                                             404             420            424
TRFD2                                           336             336            345
TRFN2                                           392             391            363
TD3                                             251             249            226
TN3                                             370             530            246
TRFD3                                           316             310            309
TRFN3                                           303             303            301
TD1 + TN1 + TRFD1 + TRFN1                      1420            1551           1778
TRFD1 + TD1 + TD2 + TRFN2                      1267            1279           1390
TRFN1 + TN1 + TN2 + TRFD2                      1548            1669           1842
TD2 + TN2 + TRFD2 + TRFN2                      1395            1397           1454
TRFD2 + TD2 + TD3 + TRFN3                      1153            1138           1193
TRFN2 + TN2 + TN3 + TRFD3                      1484            1652           1344
TD3 + TN3 + TRFD3 + TRFN3                      1242            1393           1083
Predicted TDD                                  1548            1669           1842
Measured TDD                                   1548            1669           1842
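As a quick cross-check, the sketch below (not the authors' tooling) evaluates the Section IV maximum over the X = 15, Y = 15 column of Table I. Because the table's parameters are rounded to whole picoseconds, the recomputed sums land within a few picoseconds of the published entries, and the maximizing term is the same one (TRFN1 + TN1 + TN2 + TRFD2) that sets the predicted TDD:

    # Stage parameters from Table I, X = 15 / Y = 15 column (whole ps).
    stages = [dict(TD=267, TN=393, TRFD=346, TRFN=413),
              dict(TD=261, TN=404, TRFD=336, TRFN=392),
              dict(TD=251, TN=370, TRFD=316, TRFN=303)]

    candidates = []
    for i, s in enumerate(stages):
        candidates.append(s['TD'] + s['TN'] + s['TRFD'] + s['TRFN'])
        if i > 0:
            p = stages[i - 1]
            candidates.append(p['TRFD'] + p['TD'] + s['TD'] + s['TRFN'])
            candidates.append(p['TRFN'] + p['TN'] + s['TN'] + s['TRFD'])

    print(max(candidates))   # 1546 -- within rounding of the 1548 ps TDD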


Effect of Channel Lengthening and Threshold Voltage Variation on a Nanometric Gate’s Delay and Power

Azam Beg1, Amr Elchouemi2, and Raahim Beg3 1United Arab Emirates University, Faculty of Information Technology, Al-Ain, United Arab Emirates 2Hewlett Packard, Austin, TX, USA 3Liwa School, Al-Ain, United Arab Emirates

Abstract - Rapid scaling of CMOS devices in recent years has not only increased the leakage power consumption but also increased the susceptibility of the circuits to device parameter variations. As a method for mitigating these effects, we have investigated the use of MOS transistors with longer-than-minimum channels. We performed Monte Carlo simulations to quantify the effects of threshold voltage variation and channel lengthening. With increased channel lengths, we were able to attain reduced susceptibility to the variations. Power reduction was also achieved, but at the cost of performance. The longer-than-minimum channel technique could be useful for low power and mobile applications.

Keywords: Low power circuits, parameter variation, threshold voltage, channel length, performance

1 Introduction

In the last few decades, continual reduction of transistor dimensions has resulted in a substantial increase in the leakage power's share of the total power dissipated by CMOS circuits. So controlling the leakage current is an obvious approach for reducing the power consumption. Yet another adverse result of transistor scaling is an increase in parameter fluctuations, which causes a loss of both performance and reliability. The variations occur during the fabrication process or appear later due to aging. With feature scaling, it is getting harder to retain a consistent value of threshold voltage (VTH) for a large number of transistors on an IC. Deviations in VTH occur mainly due to randomness of the location and number of dopant atoms. The results in [1] show that VTH is approximately normally distributed with a standard deviation of:

    σVTH ≈ 3.19×10^-8 × (tox × NA^0.4) / √(Leff × Weff)    (1)

where tox is the oxide thickness, NA is the channel doping, Weff is the channel width, and Leff is the channel length.

As we can see in equation (1), the variation in VTH can be reduced by increasing L and/or W. Normally, VLSI designers start with minimum LnMOS and LpMOS, and WnMOS = 2×LnMOS. Then they adjust WpMOS to balance the high-to-low and low-to-high propagation delays. Doing so increases the area of the pMOS transistor and makes it even more reliable than the nMOS, which was already less reliable than the pMOS to begin with. This means that despite the increased area, there is no improvement in the overall gate's reliability [2]. In this paper, we present a transistor sizing scheme in which we retain the traditional WpMOS and WnMOS ratios but lengthen the LpMOS and LnMOS dimensions (while keeping the W's constant) for a CMOS inverter. With simulations, we have measured the sensitivity of delay and power to the channel length. We also performed Monte Carlo (MC) simulations with randomly distributed VTH, and analyzed its effect on normalized values of delay and power.
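Since σVTH scales as 1/√(Leff × Weff), the benefit of channel lengthening at constant width can be read directly off equation (1). The sketch below simply evaluates that ratio; it is illustrative and not from the paper:

    import math

    def sigma_ratio(l_scale: float) -> float:
        """sigma_VTH(L = l_scale * Lmin) / sigma_VTH(Lmin), W held fixed."""
        return 1 / math.sqrt(l_scale)

    for s in (1.0, 1.2, 1.4, 1.6):
        print(f"L = {s:.1f} x Lmin -> sigma_VTH ratio = {sigma_ratio(s):.3f}")
    # 1.2x Lmin already cuts sigma_VTH by ~9%; 1.6x Lmin by ~21%.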

2 Related Work

2.1 Leakage Power Reduction

The topic of leakage power estimation and reduction has been extensively researched in the past 10-15 years. A brief review of the work is given below.

Halter and Najm [3] proposed a method for placing circuits into a low-leakage standby state and reported power reductions of 7%-54% for several ISCAS-89 benchmark circuits. Yasuda and Hosokawa [4] reduced standby leakage current by using low-VTH devices for high performance and low power dissipation in active and standby modes. Ishibashi et al. [5] presented a self-adjusting forward body bias technique for maximizing the forward body bias voltage without increasing the forward current. Bol et al. [6] introduced an ultra-low power logic which exhibited low leakage currents but operated at very low frequency; the logic uses extra transistors to create a feedback mechanism to cut off current in the pMOS-nMOS stack. Calhoun [7] introduced the idea of ultradynamic voltage scaling (UDVS), wherein the supply voltage swings between above-threshold and sub-threshold levels as needed by the operation being performed. For the purpose of increasing reliability and reducing power, reversing of transistor dimensions in CMOS gates was proposed in [8]; this way, the device area remains the same but there is an appreciable increase in the delay time [9].

2.2 Parameter Variation

The effects of parameter variations have been the subject of many research works, some of which are mentioned here. Mukhopadhyay and Roy [10] presented analytical models for estimating variation effects in 50 nm CMOS devices. Taylor and Fortes [11] showed how changes in VTH affect the output voltage levels of an inverter and a NAND-2 gate. Agarwal and Nassif [12] presented a detailed study of techniques for parameter measurement on actual silicon (65 nm SOI), in order to correlate analytical models with the actual hardware. Asenov [13] studied the impact of statistical variation in MOS transistors due to random dopants (location and count), line edge roughness, polysilicon granularity, and oxide thickness changeability, and concluded that random dopants were the most dominant parameter affecting device scaling. Ibrahim [14] studied the effect of VTH variation on NAND2 CMOS gates using Bayesian networks; however, no actual simulations were used to measure power or delay. Remarsu and Kundu [15] characterized the process variation in thermal sensors operating at different temperatures. Tang et al. [16] incorporated a simplistic transistor model in a statistical simulation engine for the purpose of estimating delay, power, and noise effects.

Figure 1. Test circuit of an inverter with fan-out of 4.

3 Experiments and Results

In this paper, we have analyzed a CMOS inverter using NG-Spice ver. 22 and a chain circuit (see Figure 1). We used the 16 nm high performance (HP) v2.1, metal gate, high-k, strained-Si predictive technology model (PTM) [17]. A square wave (50% duty cycle) of 10 MHz was used as the stimulus. The same frequency was used for both above- and below-threshold voltage experiments.

Using simulations, we studied the effect of increased channel length L on delay and power under different conditions, i.e., a range of VDD with constant VTH, and above- and sub-threshold VDD while VTH changes randomly. The channel lengths used are: {L | Lmin, 1.2×Lmin, 1.4×Lmin, 1.6×Lmin}.

In Figure 2, we show the sensitivity of inverter delay to VDD. We notice that at a sub-threshold VDD of 200 mV, the delay penalty is the highest among all lengths. The penalty drops as VDD increases to 1 V. The increase in delay ranges between 1.7× and 17.5×.

Figure 2. Relationship of an inverter's delay to supply voltage with different channel lengths.
Figure 3. Relationship of an inverter's power to supply voltage with different channel lengths.


Figure 3 shows the relationship of VDD and power. Expectedly, low VDD results in low power consumption and vice versa. At the nominal voltage of 700 mV, the increase in L has the lowest benefit, while at the highest supply voltage of 1 V, going to 1.2×Lmin provides 3× savings in power. Overall, the savings in power range between just 1.05× and 3.1×. Besides the supply voltage, the drop in power consumption can be attributed to the decrease in leakage current due to increased L.

We looked into changes in delay and power for an inverter (the unit-under-test in Figure 1) when it was subjected to VTH variations using 1000 Monte Carlo simulations. Mean values of VTH for the nMOS and pMOS transistors were taken from the 16 nm PTM models [17], and the standard deviation for VTH was based on equation (1).

The histograms for delay and power for VDD = 300 mV are shown in Figures 4 and 5, respectively; and for VDD = 700 mV, in Figures 6 and 7, respectively. As expected, the delays get longer as L increases.

Table 1 lists the normalized delay values (calculated as µ[delay_i]/σ[delay_i]). The smallest normalized delay is seen with L = 1.2×Lmin when VDD = 300 mV, and with L = Lmin when VDD = 700 mV. In Table 2, we notice that the lowest normalized power (calculated as µ[power_i]/σ[power_i]) happens at VDD = 300 mV when L = 1.2×Lmin, while VDD = 700 mV gives the lowest power with L = Lmin.

Figure 4. Delay histogram for different channel lengths while operating with sub-threshold supply voltage (VDD = 300 mV).
Figure 5. Power histogram for different channel lengths while operating with sub-threshold supply voltage (VDD = 300 mV).

Figure 6. Delay histogram for different channel lengths while operating with nominal supply voltage (VDD = 700 mV).
Figure 7. Power histogram for different channel lengths while operating with nominal supply voltage (VDD = 700 mV).


These results confirmed our expectation that longer channels decrease both power consumption and performance.

Lastly, we investigated how larger deviations in VTH reduce the noise margin. In Figure 8, we can see that the output levels drop when VTH is lower than its nominal value. Vout for a normally-sized transistor (with length L = Lmin) has the worst decline, but with just a 20% increase in L (i.e., 1.2×Lmin) the output voltage level improves significantly. However, Vout in all cases shown in the figure stays above 0.9×VDD = 630 mV. In comparison, when the input is a logic '1', Vout is the worst when L = Lmin and VTH is 20% of its nominal value; in this case the drop in Vout exceeds the 10% noise margin (see Figure 9). Nevertheless, lengthening the channel by 20% makes an appreciable improvement in the noise margin even with the lowest VTH (20% of nominal).

4 Conclusions and Ongoing Work

In this paper, we have shown increased channel lengths (L > Lmin) as a technique for reducing the adverse effect of VTH variation on a CMOS inverter's behavior. With longer channels, a reduction in power is also achieved, albeit at the cost of performance and area.

Our continued work involves using L > Lmin for other primitive gates and simple circuits (for example, a 28-transistor one-bit full adder). We are also studying the variation of other parameters, such as line edge roughness, temperature, etc.

5 Acknowledgments

We are thankful to Dr. Borivoje Nikolic, University of California, Berkeley, CA, USA, for his valuable inputs on the research topic and the subsequent feedback on the initial version of this paper.

Table 1. Normalized delay with random VTH variations

Channel length L    Normalized delay,   Normalized delay,
                    VDD = 300 mV        VDD = 700 mV
Lmin                0.99                4.74
1.2 × Lmin          0.64                7.69
1.4 × Lmin          0.97                8.89
1.6 × Lmin          1.02                8.06

Table 2. Normalized power with random VTH variations

Channel length L    Normalized power,   Normalized power,
                    VDD = 300 mV        VDD = 700 mV
Lmin                3.83                5.14
1.2 × Lmin          3.34                42.42
1.4 × Lmin          3.81                18.34
1.6 × Lmin          4.72                14.42
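A sketch of the Monte Carlo post-processing behind these tables (not the authors' code): VTH is drawn from a normal distribution and each sample's simulated delay is reduced to the µ/σ figure of merit. The toy delay model and the VTH numbers below are stand-in assumptions for the actual NG-Spice runs:

    import random
    import statistics

    def toy_delay(vth: float, vdd: float = 0.7) -> float:
        # crude alpha-power-law stand-in for the real NG-Spice simulation
        return 1.0 / (vdd - vth) ** 1.3

    def normalized(samples):
        """mu/sigma: a larger value means less spread relative to the mean."""
        return statistics.mean(samples) / statistics.stdev(samples)

    random.seed(1)
    vth_nom, sigma_vth = 0.48, 0.03   # assumed values, not the PTM numbers
    delays = [toy_delay(random.gauss(vth_nom, sigma_vth)) for _ in range(1000)]
    print(f"normalized delay (mu/sigma) = {normalized(delays):.2f}")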

Figure 8. Output voltage level's (Vout) sensitivity to VTH when Vin = logic '0'.
Figure 9. Output voltage level's (Vout) sensitivity to VTH when Vin = logic '1'.


6 References

[1] Li, Y., Hwang, C.-H., Yeh, T.-C. and Li, T.-Y. 2008. Large-scale atomistic approach to random-dopant-induced characteristic variability in nanoscale CMOS digital and high-frequency integrated circuits. In Proceedings of the Int. Conf. CAD (ICCAD) (San Jose, CA, USA, Nov. 2008), 278-285.
[2] Ibrahim, W. and Beiu, V. 2011. Devices and input vectors are shaping von Neumann multiplexing. IEEE Trans. Nanotech., 10 (May 2011), 606-616.
[3] Halter, J. P. and Najm, F. N. 1997. A gate-level leakage power reduction method for ultra-low-power CMOS circuits. In Proceedings of the IEEE 1997 Custom Integrated Circ. Conf. (Santa Clara, CA, USA, May 1997), 475-478.
[4] Yasuda, T. and Hosokawa, K. 2003. A low power CMOS circuit with variable source scheme (VSCMOS). In Proceedings of the Asia & South Pacific Des. Automation Conf. (ASP-DAC 2003) (Kitakyushu, Japan, Jan. 2003), 404-407.
[5] Ishibashi, K., Fujimoto, T., Yamashita, T., Okada, H., Arima, Y., Hashimoto, Y., Sakata, K., Minematsu, I., Itoh, Y., Toda, H., Ichihashi, M., Komatsu, Y., Hagiwara, M. and Tsukada, T. 2006. Low-voltage and low-power logic, memory, and analog circuit techniques for SoCs using 90 nm technology and beyond. IEICE Trans. Electron., E89-C, 3 (Mar. 2006), 250-262.
[6] Bol, D., Vos, J. D., Flandre, D. and Legat, J.-D. 2009. Ultra-low-power high-noise-margin logic with undoped FD SOI devices. In Proceedings of the IEEE Int. SOI Conf. (Foster City, CA, USA, Oct. 2009), 97-98.
[7] Calhoun, B. H., Ryan, J. F., Khanna, S., Putic, M. and Lach, J. 2010. Flexible circuits and architectures for ultralow power. Proc. IEEE, 98 (Feb. 2010), 253-266.
[8] Sulieman, M. H., Beiu, V. and Ibrahim, W. 2010. Low-power and highly reliable logic gates: Transistor-level optimizations. In Proceedings of the Int. Conf. IEEE-NANO (Seoul, Korea, Aug. 2010), 254-257.
[9] Beg, A., Beiu, V. and Ibrahim, W. 2011. Atto Joule CMOS gates using reversed sizing and W/L swapping. In Proceedings of the IEEE 9th Int. New Circ. & Syst. Conf. (NEWCAS 2011) (Bordeaux, France, Jun. 2011), 498-501.
[10] Mukhopadhyay, S. and Roy, K. 2003. Modeling and estimation of total leakage current in nano-scaled CMOS devices considering the effect of parameter variation. In Proceedings of the 2003 Int. Symp. Low Power Electronics & Des. (ISLPED'03) (Seoul, Korea, Aug. 2003). ACM, 172-175.
[11] Taylor, E. and Fortes, J. 2005. Device variability impact on logic gate failure rates. In Proceedings of the 16th IEEE Int. Conf. Application-Specific Syst., Archit. & Processors (ASAP 2005) (Jul. 2005), 247-253.
[12] Agarwal, K. and Nassif, S. 2007. Characterizing process variation in nanometer CMOS. In Proceedings of the 44th Annual Des. Automation Conf. (San Diego, CA, USA, 2007). ACM, 396-399.
[13] Asenov, A. 2007. Simulation of statistical variability in nano MOSFETs. In Proceedings of the 2007 IEEE Symp. VLSI Tech. (Jun. 2007), 86-87.
[14] Ibrahim, W. and Beiu, V. 2009. Reliability of NAND-2 CMOS gates from threshold voltage variations. In Proceedings of the 6th Int. Conf. Innovations in Information Tech. (IIT'09) (Al-Ain, United Arab Emirates, Mar. 2009), 310-314.
[15] Remarsu, S. and Kundu, S. 2009. On process variation tolerant low cost thermal sensor design in 32nm CMOS technology. In Proceedings of the 19th ACM Great Lakes Symp. VLSI (Boston Area, MA, USA, May 2009). ACM, 487-492.
[16] Tang, Q., Zjajo, A., Berkelaar, M. and Meijs, N. v. d. 2010. Transistor-level gate modeling for nano CMOS circuit verification considering statistical process variations. In Proceedings of the 20th Int. Conf. Integr. Circ. & Syst. Design: Power and Timing Modeling, Optimization and Simulation (PATMOS'10) (Grenoble, France, 2010). Springer-Verlag, 190-199.
[17] Zhao, W. and Cao, Y. 2006. New generation of predictive technology modeling for sub-45nm early design exploration. IEEE Trans. Electr. Dev., 53 (Nov. 2006), 2816-2823.


SESSION OPERATING SYSTEMS TOOLS AND DESIGN + COMMUNICATION SYSTEMS AND DESIGN

Chair(s)

TBA

A Survey on Computer System Memory Management

Qi Zhu Department of Computer Science, University of Houston - Victoria, Victoria, Texas, USA

Abstract - Computer memory is central to the operation of a modern computer system; it stores data or program instructions on a temporary or permanent basis for use in a computer. In this paper, various memory management and optimization techniques to improve computer performance are reviewed, such as the hardware design of the memory organization, the memory management algorithms and optimization techniques, and some hardware and software memory optimization techniques.

Keywords: Memory Organization, Management, Optimization

1 Introduction

The computer memory system, including ROM, RAM, cache, and virtual memory, is a necessity for any modern computer system. The notion of computer memory usually refers to main memory or primary memory, which temporarily holds the data and instructions needed in process execution by the central processing unit (CPU). A perfect memory organization is one that has unlimited space and is infinitely fast so that it does not limit the processor; like a universal Turing machine, it is not practically implementable [56]. In reality, there is an increasing gap between the speed of memory and the speed of microprocessors [24]. In the early 1980s, the average access time of memory was about 200 ns, which was nearly the same as the clock cycle of the commonly used 4.77 MHz (210 ns) microprocessors of the same period. Three decades later, the typical speed of a home microprocessor is 4 GHz (0.25 ns); however, memory access time is staying around 10 nanoseconds. Thus, the growing processor-memory performance gap has become the primary obstacle to improving computer system performance [32].

This paper surveys computer memory management strategies and various memory optimization techniques in order to improve computer system performance.

2 Memory Architecture Designs

In this section, we start by looking at the design of the memory organization in modern computer systems to facilitate performance optimization.

2.1 Memory hierarchical structure

Memory was just a single-level scheme for early computers. However, as computers became faster and computer programs were getting bigger, especially with multiple processes concurrently executed under the same computer system, a single main memory that was both fast enough and large enough was not really available. As a result, the memory hierarchical structure was designed to provide rather fast speed at low cost, considering economic and physical constraints [6]. Because of the cost-performance trade-offs, the memory hierarchy of modern computers usually contains registers, cache, main memory, and virtual memory. The concept of memory hierarchy takes advantage of the principle of locality [14], which states that accessed memory words will be referenced again quickly (temporal locality) and that memory words adjacent to an accessed word will be accessed soon after the access in question (spatial locality). Loops, functions, procedures, and variables used for counting and totaling all involve temporal locality, where recently referenced memory locations are likely to be referenced again in the near future. Array traversal, sequential code execution, modules, and the tendency of programmers (or compilers) to place related variable definitions near one another all involve spatial locality - they all tend to generate clustered memory references.

2.2 Cache

In modern computer systems, caches are small, high-speed, temporary storage areas that hold data duplicating original values stored in main memory which are currently in use [24], where the original data is expensive to access compared to the cost of accessing the cache. When a CPU needs a particular piece of information, it first checks whether it is in the cache. If it is, the CPU uses the information directly from the cache; if it is not, the CPU uses the information from memory, putting a copy in the cache under the assumption that it will be needed again soon. Information located in cache memory is accessed in much less time than that located in main memory. Thus, a CPU spends far less time waiting for instructions and data to be fetched and/or stored.

In this section, some cache optimization techniques are discussed to escalate cache performance through improving hit time, increasing bandwidth, dropping miss penalty, and reducing miss rate. Compiler optimization and prefetching techniques for main memory management will be discussed in the next section.

2.2.1 Small and Multi-level Cache

Cache is extremely expensive, and a small cache decreases hit time but increases the miss rate [23]. Thus, many computers use multiple levels of caches to address the trade-off between cache latency and hit rate [47]. Small fast caches on chip are backed up by larger, slower caches on separate memory chips. Using multiple levels of caches can yield an overall improvement in performance [38]. Some modern computers have begun to have three levels of on-chip cache; for example, the AMD Phenom II has a 6 MB on-chip L3 cache and the Intel i7 has an 8 MB on-chip L3 cache [54].
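A back-of-the-envelope sketch of why an added cache level pays off, in terms of average memory access time (AMAT); all latencies and miss rates below are illustrative assumptions, not figures from the survey:

    # AMAT with and without a second-level cache (assumed numbers).
    L1_HIT, L1_MISS_RATE = 1.0, 0.05     # ns, fraction of L1 accesses
    L2_HIT, L2_MISS_RATE = 10.0, 0.20    # ns, fraction of L2 accesses
    DRAM = 100.0                         # ns

    amat_l1_only = L1_HIT + L1_MISS_RATE * DRAM
    amat_l1_l2 = L1_HIT + L1_MISS_RATE * (L2_HIT + L2_MISS_RATE * DRAM)
    print(amat_l1_only)   # 6.0 ns
    print(amat_l1_l2)     # 2.5 ns: the L2 absorbs most L1 misses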

2.2.2 Pipelined Cache

The cache can be pipelined into more stages to improve the clock frequency of access, which gives fast clock cycle time and high bandwidth [24]. For example, the MIPS R4000 divided the cache access into three independent stages [27]: two pipeline stages are needed to access the on-chip cache and a third is needed to perform the tag check, with the instruction tag checked simultaneously with decode. As a result, the latency of cache lookup and tag comparison can be dramatically decreased by a pipelined cache. [37] showed that a significant performance advantage is gained by using two or three pipeline stages to fetch data from the cache for high-clock-rate processors.

2.2.3 Trace Cache

A trace cache, proposed by [45], is a mechanism for increasing the instruction fetch bandwidth by storing traces of instructions in their dynamic order of execution rather than the static, sequential program order. A trace cache no longer has the problem of latency caused by generating pointers to all of the noncontiguous instruction blocks. In addition, the trace cache is inherently a structure that exploits repeating branch behavior and works well for programs with repetition structures. A trace cache has the benefit of delivering multiple blocks of instructions in the same cycle without support from a compiler or a modified instruction set, and there is no need to rotate or shift basic blocks to create the dynamic instruction sequence. However, the disadvantage is that a trace cache stores a lot of redundant information [5]. The Pentium 4 includes a trace cache storing maximally 12K decoded micro-operations, and it uses virtual addresses so there is no need for address translation [55]. In [43], a software trace cache is used to improve the instruction cache hit rate and instruction fetch bandwidth; the results showed it was effective for not-taken branches and long chains of sequential instructions.

2.2.4 Nonblocking Cache

A cache miss occurs when a memory request cannot find the address in the cache, which stalls the process and degrades system performance. A non-blocking cache is employed to exploit post-miss operations by continuing to fetch during cache misses as long as dependency constraints are observed [3, 10]. [15] reported that non-blocking caches provided a significant performance improvement: simulating the SPEC92 benchmarks, they reduced run-time by 21% with a 64-byte, 2-way set-associative cache with a 32-cycle fetch latency. The Intel Pentium Pro uses this technology for its L2 cache; the processor could issue up to 4 cache requests at a time [4].

2.2.5 Multibanked Caches

Caches are often partitioned into independent banks that run in parallel, so as to increase cache bandwidth and to keep up with the speed of the CPU [44]. Memory usually is split by address into the cache banks. For example, the AMD Athlon 64 has 8 banks, and each cycle it can fetch two 8-byte data items from different banks. However, only one access at a time is allowed to the same bank if it is single-ported, as in the MIPS R10000 superscalar microprocessor [60]. Since the mapping of addresses to banks affects performance, the simplest mapping, called sequential interleaving, spreads the block addresses sequentially across the banks. [12] also proposed vertical interleaving to divide the memory banks in a cache into vertically arranged sub-banks; the results showed up to a 48% reduction in maximum power density with two banks and up to a 67% reduction with four banks when running Alpha binaries.
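A minimal sketch of the sequential interleaving mapping just described; the bank count and block size are assumed values for illustration:

    # Sequential interleaving: consecutive cache blocks map to
    # consecutive banks, so streaming accesses proceed in parallel.
    N_BANKS = 8          # e.g., an 8-banked data cache
    BLOCK_SIZE = 64      # bytes per cache block

    def bank_of(addr: int) -> int:
        """Bank index of the block containing byte address addr."""
        return (addr // BLOCK_SIZE) % N_BANKS

    # Two sequential blocks land in different banks -> parallel access.
    print(bank_of(0x1000), bank_of(0x1040))   # 0 1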

3 Memory Replacement Strategies

In a paged virtual memory system, when all of the page frames are occupied and a process references a nonresident page, the system must not only bring in a new memory page from auxiliary storage, but first select a victim page to be replaced by the incoming page. Pages swapped out of main memory must be chosen carefully; otherwise redundant memory reads and writes occur if a page is requested again immediately after it has been removed, and system performance degrades because of the many references to the secondary storage device. Replacement algorithms can be divided into static page replacement algorithms, dynamic page replacement algorithms, and working set algorithms.

3.1 Static page replacement algorithms

Static page replacement algorithms all assume that each program is allocated a fixed amount of memory when it begins to execute and does not request further memory during its lifetime.

3.1.1 Optimal algorithm (also called OPT or MIN)

It replaces the page that will not be referenced again until furthest in the future, or that has the longest number of page requests before it is referenced [42]. Use of this replacement algorithm guarantees the lowest possible page-fault rate for a fixed number of frames. However, the OPT algorithm is impractical unless you have full prior knowledge of the reference stream or a record of past behavior that is incredibly consistent. As a result, the OPT algorithm is used mainly as a benchmark to which other algorithms can be compared.

3.1.2 First-In-First-Out algorithm (FIFO)

FIFO selects the page that has been in the system the longest (or first-in) to be removed. Conceptually, it works as a limited-size queue, with items being added at the tail. When the queue is full, the first page to enter is moved out of the head of the queue. Similar to RAND, FIFO blatantly ignores trends and might choose to remove a heavily referenced page. Although FIFO can be implemented with relatively low overhead using a queue, it is rarely used because it does not take advantage of locality trends.

A modification to FIFO that makes its operation much more useful is the pair of second-chance and clock page replacement algorithms. The only modification here is that a single bit is used to identify whether or not a page has been referenced during its time in the FIFO queue. The second-chance algorithm examines the referenced bit of the oldest page; if this bit is off, it immediately selects that page for replacement. Otherwise, it turns off the bit and moves the page to the tail of the FIFO queue. Active pages will be kept at the tail of the list and thus remain in main memory. The clock page replacement algorithm works essentially the same as the second-chance algorithm, except it arranges the pages in a circular list rather than a linear list [2]. Each time a page fault occurs, a list pointer moves around the circular list like the hand of a clock. When a page's reference bit is turned on, the pointer turns the reference bit off and moves to the next element of the list. The clock algorithm places the new arrival page in the first page it encounters with the referenced bit turned off.

3.1.3 Least-Recently-Used algorithm (LRU)

LRU relies on a process's recent past behavior as an indication of its near-future behavior (temporal locality). It selects the page that has been in memory the longest time without being referenced when picking a victim. Although LRU can provide better performance than FIFO, the benefit comes at the cost of system overhead [57]. For example, one implementation of the LRU algorithm - a stack algorithm - contains one entry for each occupied page frame. Whenever a page is referenced, it is removed from the stack and put on the top. In this way, the most recently used page is always at the top of the stack and the least recently used page is always at the bottom. When an existing page must be replaced to make room for an incoming one, the system replaces the entry at the bottom of the stack. The substantial overhead happens because the system must update the stack every time a page is referenced.

3.1.4 Least-Frequently-Used algorithm (LFU)

Often confused with LRU, the least-frequently-used (LFU) algorithm selects a page for replacement if it has not been used often in the past. This strategy is based on the intuitively appealing heuristic that a page that is not being intensively referenced is not as likely to be referenced in the future. LFU can be implemented using a counter associated with each page, and the counter is updated each time its corresponding page is referenced; however, this can incur substantial overhead. The LFU algorithm can easily select incorrect pages for replacement; for instance, the least-frequently-used page could be the page brought into main memory most recently. Furthermore, it tends to respond slowly to locality changes [22]. When a program either changes its set of active pages or terminates and is replaced by a different program, the frequency count will cause pages in the new locality to be immediately replaced, since their frequency is much lower than that of the pages associated with the previous program. Because the context has changed and the pages swapped out will most likely be needed again soon (the principle of locality), a period of thrashing will likely occur.

3.2 Dynamic page replacement algorithms

When page allocation can change, static algorithms cannot adjust to keep performance optimized; for example, when a program rapidly switches between needing relatively large and relatively small page sets or localities [4]. Furthermore, it is incredibly difficult to find the optimal page allocation, since a full analysis is rarely available to a virtual memory controller. However, dynamic paging replacement algorithms can adjust and optimize available pages based on reoccurring trends. This controlling policy is also known as prefetch paging.

3.2.1 SEQ algorithm

Glass and Cao [21] propose a new virtual memory page replacement algorithm, SEQ. Usually SEQ works like LRU replacement; however, it monitors page faults when they occur, and finds long sequences of page faults to contiguous virtual addresses. When such sequences are detected, SEQ performs a most-recently-used replacement that mimics the OPT algorithm. For applications that have sequential behavior, SEQ has better performance compared to LRU; for other types of applications, it performs the same as LRU. Their simulation results on a Solaris 2.4 implementation showed that for a large class of applications with clear access patterns, SEQ performs close to the optimal replacement algorithm, and it is significantly better than LRU; for other applications, SEQ's performance is similar to the LRU algorithm.

3.2.2 Adaptive prefetching algorithm

[19, 62] present a disk prefetching algorithm based on on-line measurements of disk transfer times and of inter-page fault rates to adjust the level of prefetching dynamically and optimize system performance. In [19], the prefetching algorithm is proposed using a life-time function for a single-process execution model; a theoretical model is given, and the analytical approach is extended to a multiprogramming environment using a queuing network model. Simulation experiments show the effective performance of the prefetching algorithm for some synthetic traces. [62] further studies this adaptive prefetching algorithm for multiprocess execution models and uses trace-driven simulations to evaluate its effectiveness on real workloads. The adaptive prefetching algorithm could always keep the system efficiency and the system performance optimal.
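A toy comparison, not from the survey, of the kind used to contrast the static policies above: it counts page faults for FIFO and LRU on a short reference string with three frames. On this particular string FIFO happens to beat LRU, which illustrates why no static policy dominates:

    from collections import OrderedDict

    def count_faults(refs, frames, policy):
        mem, faults = OrderedDict(), 0
        for page in refs:
            if page in mem:
                if policy == "LRU":
                    mem.move_to_end(page)      # refresh recency order
            else:
                faults += 1
                if len(mem) == frames:
                    mem.popitem(last=False)    # evict oldest / least recent
                mem[page] = True
        return faults

    refs = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
    print(count_faults(refs, 3, "FIFO"))  # 9 faults
    print(count_faults(refs, 3, "LRU"))   # 10 faults on this string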

3.2.3 Early eviction LRU (EELRU)

[52] introduces early eviction LRU (EELRU), which adjusts its speed of adaptation by aggregating recent information to recognize the reference behavior of a workload. The authors prove that EELRU offers strong theoretical guarantees of performance relative to the LRU algorithm: while it can be better than LRU by a large factor, it can never be more than a factor of 3 worse than LRU.

Another adaptive dynamic cache replacement algorithm extending EELRU is proposed based on prefix caching for a multimedia server cache system [25]. Two LRU page lists for the prefix are stored in the cache: one maintains the prefixes that have been requested only once recently, while the other maintains those that have been requested at least twice recently. The simulation results show that the algorithm works better than LRU by maintaining the list of recently evicted videos.

3.2.4 ARC and CAR algorithms

The ARC (Adaptive Replacement Cache) algorithm was developed by the IBM Almaden Research Center for the IBM DS6000/DS8000 storage controllers and works better than LRU [34]. It divides the cache directory into two lists, for storing recently and frequently referenced entries. In addition, a ghost list is used for each list to keep the history of recently evicted cache entries. ARC's dynamic adaptation exploits both the recency and the frequency features of the workload in a self-tuning way, and it provides a low-overhead alternative that outperforms LRU across a wide range of workloads and cache sizes.

The CAR (Clock with Adaptive Replacement) algorithm inherits virtually all the advantages of ARC [2]; it uses two clocks with recency- and frequency-referenced entries. The sizes of the lists of recently evicted pages are adaptively determined by using a precise history mechanism. Simulations demonstrate that CAR's performance is comparable to ARC, and substantially better than both the LRU and CLOCK algorithms.

3.2.5 Working set algorithms

Denning's working set theory was developed to study what a favored subset is and how to maintain it in main memory to achieve the best performance [13]. It states that for a program to run efficiently, the system must keep the pages of the program's working set in main memory; otherwise excessive paging activity may happen, causing low processor utilization - thrashing. The working set is defined as W(t, w), where w is the process's working set window size referenced during the interval from process time t − w to the current process time t, as shown in Figure 1. A working set memory management policy tries to keep only the current working set pages in main memory to exploit the performance in the execution of that process. However, the policy is usually heuristic, since it is so hard for the system to know the current working set pages in advance.

Figure 1. Working set model

Localities may change; however, generally locality sets repeat within the global locality. Working set algorithms assume that during the working set time-window, the program can add or remove pages until the time has expired [33]. When the window finishes, the time window may be dynamically adjusted to provide maximal correspondence with locality changes [13]. The adjustments can be made in a variety of ways, but mainly as a function of the rate of page faults occurring within the program.

Working set algorithms do not always use a specific time window to determine the active set. For example, various page fault frequency (PFF) algorithms can be used to monitor the rate at which a program incurs faults [48]. This is similar to modifying the time interval, but it is not subject to a minimal time for change to occur: when locality transits, page allocation or release may occur rapidly. PFF has its limitations for some applications. [33] gave an example that requires unrelated references to a database, causing a large fault frequency; in this case, the program would not benefit from keeping the old references in memory. Rapid changes in the fault frequency due to this kind of access would result in either wasted page allocation or rapid thrashing with the algorithm, both detracting from its usefulness. Other working set methods, like the working set clock algorithm (WSClock) [8], can closely approximate a static method on a global scale: frames are considered for replacement by a pointer moving clockwise along a circular list. Although WSClock algorithms generally behave like LRU, their actual performance can differ based on timing parameters and selection criteria.
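The W(t, w) definition transcribes directly into code. The sketch below is illustrative, not from the survey, using virtual time measured in references:

    def working_set(trace, t, w):
        """Distinct pages among the last w references ending at time t."""
        return set(trace[max(0, t - w + 1): t + 1])

    trace = [1, 2, 1, 3, 2, 2, 4, 5, 4, 4]
    print(working_set(trace, t=5, w=4))   # {1, 2, 3}
    print(working_set(trace, t=9, w=4))   # {4, 5} -- the locality shifted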

4 Memory Optimization Techniques

4.1 DRAM Chip Design

Main memory is generally made up of DRAM and has a relatively large storage capacity compared to a cache (SRAM), but it is slower, cheaper, and denser than a cache. Since the memory design needs to decrease the processor-memory gap, technologies usually are developed to provide greater bandwidth. The first innovation was a revolutionary new addressing scheme that multiplexed the row and column address lines to cut the number of address pins in half, introduced by Robert Proebsting in the Mostek MK4096 in 1975 [16]. This design sent the row access strobe (RAS) first and then the column access strobe (CAS), reducing cost and space by fitting a 4K DRAM into a 16-pin package instead of the previous 22-pin package.

Fast page mode is one improvement on conventional DRAM in which the row address is held constant and data from multiple columns is read without another row access time [36]. An additional basic change is Synchronous DRAM (SDRAM): unlike conventional DRAMs, which are asynchronous to the memory controller, it adds a clock signal to the DRAM interface and has a register that holds a bytes-per-request value, returning many bytes over several cycles per request [26]. Another major innovation to increase bandwidth is DDR (double data rate) SDRAM, which doubles the data bandwidth of the bus by reading and writing data on both the rising and falling edges of the clock signal [41].

The above optimization techniques exploit the high potential DRAM bandwidth while adding little cost to the system. There are many more techniques we cannot list because of the space limitation, such as Extended Data Out (EDO) DRAM, Enhanced Synchronous (ES) DRAM, Synchronous Link (SL) DRAM, Rambus DRAM (RDRAM), Direct Rambus (DRDRAM), Fast Cycle (FC) DRAM, etc.

4.2 Software Optimization

Most of the techniques above require changing the hardware; however, a pure software approach such as compiler optimization can also improve memory performance. Code can easily be rearranged to improve temporal or spatial locality and so reduce misses. For example, interchanging independent statements might reduce instruction miss rates by reducing conflict misses [49]. Another example is called branch straightening: when the compiler predicts that a branch will be taken, it rearranges the program code by swapping the branch target block with the block sequentially following the branch. Branch straightening can improve spatial locality.

Code and data rearrangement are mainly applied as loop transformations. Loop interchange exchanges the order of nested loops, making the code access the data in the order it is stored [58]. Loop fusion is a program transformation that fuses loops accessing similar sets of memory locations; after fusion, the accesses are gathered together in the fused loop and can be reused easily. In [61], the authors combined loop interchange and loop fusion and applied them to a collection of benchmarks; the results indicate that their approach is highly effective and ensures better performance. When some arrays are accessed by rows and some by columns, and both rows and columns are used in every iteration, loop blocking is used to create blocks instead of working on entire rows or columns, to enhance the reuse of local data [29].

4.3 Prefetching

Demand fetching can be minimized if we can successfully predict which information will be needed and fetch it in advance - prefetching. If the system is able to make correct predictions about future page uses, the process's total runtime can be reduced [7, 53]. Prefetching strategies should be designed carefully: if a strategy requires significant resources or inaccurately preloads unneeded pages, it might result in worse performance than in a demand paging system. Assume that s pages are prefetched and a fraction α of these s pages is actually used (0 ≤ α ≤ 1). The question is whether the cost of the s × α saved page faults is greater or less than the cost of prepaging s × (1 − α) unnecessary pages. If α is close to 0, prefetching loses; if α is close to 1, prefetching wins (a small numeric sketch of this trade-off follows at the end of this section). Prefetching strategies often exploit spatial locality. Furthermore, the address and size of the prefetched data are derived by considering the reference history, access pattern, and trajectory of prior memory accesses. Important criteria that determine the success of a prefetching strategy include:

- prepaged allocation - the amount of main memory allocated to prepaging
- the number of pages that are preloaded at once
- the algorithm used to determine which pages are preloaded.

Prefetching strategies are applied at different memory levels, such as caches [40], main memory [53], and disks [19, 62]. Next we discuss the categories of hardware prefetching and software prefetching.

4.3.1 Hardware Prefetching

Both instructions and data can be prefetched. Instruction prefetch normally happens outside the cache: instructions following the one currently being executed are loaded into an instruction stream buffer. If the requested instruction is already in the buffer, there is no need to fetch it again, and the next prefetch is requested. Hardware data prefetching is used to exploit run-time information without the need for programmer or compiler intervention. The simplest prefetching algorithm is the sequential prefetching algorithm, such as the one-block lookahead (OBL) in the HP PA7200 [9]. An adaptive sequential prefetching algorithm has been proposed with a dynamic k value, to match k with the degree of spatial locality of the program at a particular time. Several algorithms have been developed using a special cache called a reference prediction table (RPT) to monitor the processor's referencing pattern for prefetching with arbitrary strides [1, 17]. In [10], the authors found that the RPT may not be able to find an access pattern for indirect addressing mode, and that the RPT only works after an access pattern has been established. However, during steady loop state, the RPT can dynamically adjust its prefetching stride to achieve good performance.

4.3.2 Software Prefetching

Even though there is some software prefetching of instructions [18], most software prefetching algorithms work for data only, applied mostly within loops for large array calculations, both hand-coded and automated by a compiler [30]. [35] applied prefetching to affine array references in scientific programs: locality analysis is conducted to find the array references that suffer from cache misses; then loop unrolling and loop peeling are used to extract the cache-missing memory references; finally, prefetch instructions are applied to the isolated cache-missing references. Jump pointer prefetching connects non-consecutive link elements by adding additional pointers into a dynamic data structure [46]; it is effective when limited work is available between successive dependent accesses. Jump pointer prefetching can be extended with a prefetch array [28], which overcomes the problem that jump pointer prefetching is unable to prefetch the early nodes because no pointers lead to them. Having prefetch instructions embedded in a program incurs overhead, so compilers must ensure that the benefits exceed the overheads.
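Here is the promised numeric sketch of the α trade-off from Section 4.3; the fault and transfer costs are assumed units, not measured values:

    # Prefetching s pages saves s*alpha demand faults but wastes the
    # transfer of s*(1 - alpha) pages (illustrative cost model).
    FAULT_COST = 10.0      # cost of a demand page fault (arbitrary units)
    PREPAGE_COST = 2.0     # cost of transferring one prepaged page

    def prefetch_benefit(s: int, alpha: float) -> float:
        """Positive => prefetching wins over pure demand paging."""
        saved = s * alpha * FAULT_COST
        wasted = s * (1 - alpha) * PREPAGE_COST
        return saved - wasted

    for alpha in (0.1, 0.2, 0.5, 0.9):
        print(f"alpha = {alpha:.1f}: net = {prefetch_benefit(8, alpha):+.1f}")
    # With these costs the break-even point is alpha = 1/6.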

5 Conclusions

In this survey, we review memory system design, memory management techniques, and optimization approaches. Given the constraints on the length of this paper, we have not addressed the role of memory management and optimization techniques in embedded systems such as handheld devices. [31] has examined data and memory optimization techniques for memory structures at different levels in embedded systems, and a survey of techniques for optimizing the memory behavior of embedded software has been conducted in [59].

Memory systems will remain a vital part of computers unless the conventional von Neumann model becomes outdated. DRAM has been dominating primary memory since 1971. The processor-memory latency gap continues to grow because the continual improvement in processor cycle speed is much faster than the improvement in DRAM access latency. Many processor and memory designs, architectures, and optimization techniques have been developed and proposed to minimize the gap. However, it would be no surprise if another technology soon replaced DRAM on price/performance grounds. For example, Magnetic RAM (MRAM) has the advantage of being non-volatile. Another choice is optical storage, which could transport the entire system memory image in a small three-dimensional carrier. No matter what the technology is, it should balance cost, performance, power consumption, and complexity.

6 References

[1] Baer, J.L. and Chen, T.F. (1991) An Effective On-chip Preloading Scheme to Reduce Data Access Penalty, Proceeding of Supercomputing '91, Albuquerque, NM, November, 176-186.
[2] Bansal, S. and Modha, D.S. (2004) CAR: Clock with Adaptive Replacement, Proceeding of USENIX Conference on File and Storage Technologies, 187-200, CA.
[3] Belayneh, S. and Kaeli, D. (1996) A Discussion on Non-blocking/lockup-free Caches, ACM SIGARCH Computer Architecture News, 24(3), 18-25.
[4] Bhandarkar, D. and Ding, J. (1997) Performance Characterization of the Pentium Pro Processor, Proceeding of High-Performance Computer Architecture, 1-5 Feb, 288-297, San Antonio, Texas, USA.
[5] Black, B., Rychlik, B. and Shen, J.P. (1999) The Block-based Trace Cache, Proceeding of the 26th Annual International Symposium on Computer Architecture, 196-207.
[6] Burger, D., Goodman, J.R. and Sohi, G.S. (1997) Memory Systems, The Computer Science and Engineering Handbook, 47-461.
[7] Cao, P., Felten, E.W., Karlin, A.R. and Li, K. (1995) A Study of Integrated Prefetching and Caching Strategies, Proc. of ACM SIGMETRICS, 188-197.
[8] Carr, W.R. and Hennessy, J.L. (1981) WSClock - A Simple and Effective Algorithm for Virtual Memory Management, Proc. of the ACM SOSP, 87-95.
[9] Chan, K.K., et al. (1996) Design of the HP PA 7200 CPU, Hewlett-Packard Journal, 47(1), 25-33.
[10] Chen, T. and Baer, J. (1992) Reducing Memory Latency via Non-blocking and Prefetching Caches, Proceeding of ACM SIGPLAN Notices, 27(9), 51-61.
[11] Chen, T. and Baer, J. (1994) A Performance Study of Software and Hardware Data Prefetching Schemes, Proceeding of the 21st Annual International Symposium on Computer Architecture, Chicago, IL, April, 223-232.
[12] Cho, S. (2007) I-Cache Multi Banking and Vertical Interleaving, Proceeding of ACM Great Lakes Symposium on VLSI, March 11-13, 14-19, Stresa-Lago Maggiore, Italy.
[13] Denning, P.J. (1980) Working Sets Past and Present, IEEE Transactions on Software Engineering, SE-6(1), 64-84.
[14] Denning, P.J. (2005) The Locality Principle, Communications of the ACM, 48(7), 19-24.
[15] Farkas, K.I., Jouppi, N.P. and Chow, P. (1994) How Useful Are Non-Blocking Loads, Stream Buffers, and Speculative Execution in Multiple Issue Processors? Western Research Laboratory Research Report 94/8.
[16] Foss, R.C. (2008) DRAM - A Personal View, IEEE Solid-State Circuits Newsletter, 13(1), 50-56.
[17] Fu, J.W.C., Patel, J.H. and Janssens, B.L. (1992) Stride Directed Prefetching in Scalar Processors, Proceeding of the 25th International Symposium on Microarchitecture, Portland, OR, December, 102-110.
[18] Galazin, A.B., Stupachenko, E.V. and Shlykov, S.L. (2008) A Software Instruction Prefetching Method in Architectures with Static Scheduling, Programming and Computer Software, 34(1), 49-53.
[19] Gelenbe, E. and Zhu, Q. (2001) Adaptive Control of Prefetching, Performance Evaluation, 46, 177-192.
[20] Ghasemzadeh, H., Mazrouee, S., Moghaddam, H.G., Shojaei, H. and Kakoee, M.R. (2006) Hardware Implementation of Stack-Based Replacement Algorithms, World Academy of Science, Engineering and Technology, 16, 135-139.
[21] Glass, G. and Cao, P. (1997) Adaptive Page Replacement Based on Memory Reference Behavior, Proc. of the 1997 ACM SIGMETRICS, 25(1), 115-126.
[22] Gupta, R.K. and Franklin, M.A. (1978) Working Set and Page Fault Frequency Replacement Algorithms: A Performance Comparison, IEEE Transactions on Computers, C-27, 706-712.
[23] Handy, J. (1997) The Cache Memory Book, Morgan Kaufmann Publishers.
[24] Hennessy, J.L. and Patterson, D.A. (2007) Computer Architecture: A Quantitative Approach, Morgan Kaufmann Publishers Inc., San Francisco, CA, 4th Edition.
[25] Jayarekha, P. and Nair, T.R. (2009) Proceeding of InterJRI Computer Science and Networking, 1(1), Dec, 24-30.
[26] Jones, F., et al. (1992) A New Era of Fast Dynamic RAMs, IEEE Spectrum, 43-49.
[56] Kaplan, K.R. and Winder, R.O. (1973) Cache-based Computer Systems, IEEE Computer, 6(3), 30-36.
[27] Kane, G. and Heinrich, J. (1992) MIPS RISC Architecture, Prentice Hall.
[28] Karlsson, M., Dahlgren, F. and Stenstrom, P. (2000) A Prefetching Technique for Irregular Accesses to Linked Data Structures, Proceeding of the 6th International Conference on High Performance Computer Architecture, Toulouse, France, January, 206-217.

Data Structures, Proceeding of the 6th International Issue Processors, Proceeding of International Symposium conference on High Performance Computer Architecture, of Microarchitecture, Dec, 46-56. Toulouse, France, January, 206-217. [45] Rotenerg, E., Bennett, S., and Smith, J.E. (1999) A [29] Lam, M.; Rothberg, E.; Wolf, M.E. (1991) The Cache Trace Cache Microarchitecture and Evaluation, IEEE Performance and Optimization of Blocked Algorithms, Transactions on Computers, 48(2), 111-120. Proceeding of the Fourth International Conference on [46] Roth, A. and Sohi, G., (1999) Effective Jump-pointer ASPLOSIV, Santa Clara, Apr, 63-74. Prefetching for Linked Data Structures. Proceeding of the [30] Luk, C.K and Mowry, T.C (1996) Compiler-based 26th International Symposium on Computer Architecture, Prefetching for Recursive Data Structures, Proceeding of 7th Atlanta, GA, May, 111-121. Conference on Architectural Support for Programming [47] Pas, R. (2002) Memory Hierarchy in Cache-Based Languages and Operating Systems, Cambridge, MA, Systems, Sun Microsystems. October, 222-233. [48] Sadeh, E. (1975) An Analysis of the Performance of the [31] Mahapatra, N.R., and Venkatrao, B. (1999) The Page Fault Frequency (PFF) Replacement Algorithm, Proc processor-memory bottleneck: problems and solutions, of the fifth ACM SOSP, 6-13. Crossroads, 5(3). [49] Samples, A.D.; Hilfinger, P.N. (1988) Code [32] Mahajan, A.R. and Ali, M.S. (2007) Optimization of Reorganization for Instruction Caches, University of Memory System in Real-time Embedded Systems, California at Berkeley, Berkeley, CA. Proceeding of the 11th WSEAS International Conference [50] Serhan, S.I. and Abdel-Haq, H.M. (2007) Improving on Computers. 13-19. Cache Memory Utilization, World Academy of Science and [33] Marshall, W.T. and Nute, C.T. (1979) Analytic Technology, 26, 299-304. modeling of working set like replacement algorithms, Proc. [51] Shemer, J.E. and Shippey, B. (1966) Statistical of ACM SIGMETRICS, 65-72. Analysis of Paged and Segmented Computer Systems. IEEE [34] Megiddo, N.; and Modha, D.S. (2003) ARC:A Self- Trans. EC-15, 6, 855-863. tuning, Low Overhead Replacement Cache, Proceeding of [52] Smaragdakis, Y., Kaplan, S. and Wilson, P. (2003) 2nd USENIX conference on File and Storage Technologies, The EELRU Adaptive Replace- ment Algorithm, 115-130, San Franciso, CA. Performance Evaluation, 53(2), 93-123. [35] Mowry, T. (1991) Tolerating Latency through [53] Smith, A.J. (1978) Sequential Program Prefetching Software-controller Prefetching in Shared-memory in Memory Hierarchies, IEEE Computer, 11(12), 7-21. Multiprocessors, Journal of Parallel and Distributed [54] Sondag, T. and Rajan, H. (2009) A Theory of Reads Computing, 12(2), 87-106. and Writes for Multi-level Caches. Technical Report 09- [36] Ng, Ray (1992) Fast Computer Memories, IEEE 20a, Computer Science, Iowa State University. Spectrum, 36-39. [55] Tuck, N; Tullsen, D.M. (2003) Initial Observations of [37] Olukotun, K., Mudge, T., and Brown, R. (1997) the Simultaneous Multithread- ing Pentium 4 Processor, Multilevel Optimization of Piplelined Caches. IEEE Proceeding of Parallel Architectures and Compilation Transactions on Computers, 46, 10, 1093-1102. Techniques, Oct, 26-34. [38] Ou, L., He, X.B., Kosa, M.J., and Scott, S.L. 
(2005) A [56] A.M Turing (1937) On Computable Numbers, with an Unified Multiple-Level Cache for High Performance Application to the Entscheidungsproblem, Proceedings of the Storage Systems, mascots, 13th IEEE International London Mathematical Society, 42(2), 230-265. Symposium on Modeling, Analysis, and Simulation of [57] Turner, R. and Levy, H. (1981) Segmented FIFO Page Computer and Telecommunication Systems, 143-152. Replacement, Proc. of ACM SIGMETRICS on Measurement [39] Panda, P.R. et al (2001) Data and Memory and Modeling of Computer System, 48-51. Optimization Techniques for Embedded System, ACM [58] Wolf, M.E., Lam, M. (1991) A Data Locality Transactions on Design Automation of Electronic Systems, Optimizing Algorithm. Proceeding of the SIGPLAN 6(2), 149-206. conference on Programming Language Design and [40] Perkins, D.R. (1980) The Design and Management of Implementation, Toronto, 26(6), 30-44. Predictive Caches, PH.D dissertation, UC., San Diego. [59] Wolf, W, Kandemir, M. (2003) Memory System [41] Peters, M. (2000) Enhanced Memory Systems, Optimization of Embedded Soft- ware. Proceeding of IEEE, Personal Communications, September. Jan, 91(1), 165-182. [42] Prieve, B.G. and Fabry, R.S. (1976) VMIN - An [60] Yeager, K.C.(1996) The MIPS R10000 Superscalar Optimal Variable Space Page Re- placement Algorithm, Microprocessor, IEEE Micro, Apr., 16(2), 28-40, 1996. Communications of the ACM, 19(5), 295-297. [61] Yi, Q. and Kennedy, K. (2004) Improving Memory [43] Ramirez, A., Larriba-Pey, J.L., and Valero, M. (2005) HierarchyPerformance through Combined Loop Interchange Software Trace Cache, IEEE Transactions on Computer, and Multi-Level Fusion, International Journal of High 54(1), 22-35. Performance Computing Applications, 18(2), 237-253. [44] Rivers, J.A.; Tyson, G.S.; Dividson, E.S.; Austin, T.M. [62] Zhu, Q., Gelenbe, E. and Qiao, Y (2008) Adaptive (1997) On High-Bandwidth Data Cache Design for Multi- Prefetching Algorithm in Disk Controllers, Performance Evaluation, 65(5), 382-395. 70 Int'l Conf. Computer Design | CDES'12 |

Synchronization of a complex dynamical network with nonidentical nodes via dynamic feedback control

T.H. Lee1, J.H. Park1, H.Y. Jung1, S.M. Lee2 1Nonlinear Dynamics Group/Dept. EE/ICE, Yeungnam University, Kyongsan, Republic of Korea. 2Department of Electronic Engineering, Daegu University, Gyungsan, Republic of Korea.

Abstract— This paper considers the synchronization problem of a complex dynamical network with nonidentical nodes. For this problem, a dynamic feedback controller is designed to achieve synchronization of the network. Based on Lyapunov stability theory and the linear matrix inequality framework, the existence condition for feasible controllers is derived in terms of linear matrix inequalities. The condition can be solved easily by the application of convex optimization algorithms. Finally, the proposed method is applied to a numerical example in order to show the effectiveness of our result.

Keywords: Complex dynamical network, synchronization, nonidentical node, dynamic feedback control.

1. Introduction

During the last decade, complex dynamical networks, which are sets of interconnected nodes with specific dynamics, have attracted increasing attention in various fields such as physics, biology, chemistry and computer science [1]. As science and society develop, our everyday lives have become closely tied to complex networks, for instance transportation networks, the World Wide Web, coupled biological and chemical engineering systems, neural networks, social networks, electrical power grids and global economic markets. Many of these networks exhibit complexity in the overall topological and dynamical properties of the network nodes and the coupled units. Recently, one of the most significant and interesting phenomena in complex dynamical networks has been synchronization, which can be viewed from two points of view. One is synchronization within a single complex network, called 'inner synchronization' [2]-[5]: all the nodes in the network eventually approach the trajectory of a target node. The other is called 'outer synchronization' [6]-[8] and considers synchronization between two or more complex networks. In this paper, a new control problem for inner synchronization is investigated.

The random-graph model has been a basis of modern network theory since it was proposed by Erdős and Rényi [9]-[10]. In a random network, each pair of nodes is connected with a certain probability. Watts and Strogatz [11] introduced a useful network model that translates a regular network into a random network, the so-called small-world network. Newman and Watts [12] then modified it to generate another variant of the small-world model. Barabási and Albert [13] subsequently proposed a scale-free network model, in which the degree distribution of the nodes follows a power-law form. Thereafter, small-world and scale-free networks have been extensively investigated.

In work on synchronization of complex dynamical networks, many researchers adopt the assumption that the dynamics of all nodes are identical [2]-[3]. However, this assumption is unrealistic for most complex dynamical networks. For example, in a swarm robot system every individual robot has its own dynamics, and even if the swarm consists of identical robots, the network may become nonidentical due to uncertainties, saturation and so on. When the nodes of a complex network are nonidentical, they show different dynamics, and synchronization schemes devised for networks with identical nodes are of no use. Therefore, further investigation of new synchronization schemes for complex dynamical networks with nonidentical nodes is necessary. In this regard, only a few papers have been reported until now [4]-[5].

In this paper, we investigate the synchronization of a complex network with nonidentical nodes via dynamic feedback control, unlike previous works, which usually treated complex networks with identical nodes. Until now, several control schemes, such as adaptive control ([2], [7]-[8]) and pinning control ([2]-[3], [6]), have been applied to this kind of problem. However, to the best of the authors' knowledge, the synchronization problem via dynamic feedback controllers for complex networks has not been investigated up to now. In some real control situations, there is a strong need to construct a dynamic feedback controller instead of a static feedback controller in order to obtain better performance and dynamical behavior of the state response. The dynamic controller provides more flexibility than the static controller, and the apparent advantage of this type of controller is that it provides more free parameters for selection [14]. It is therefore worthwhile to consider the design problem of dynamic controllers for synchronization in a complex network. The existence condition for such controllers is derived in terms of linear matrix inequalities (LMIs), which can be easily solved by standard convex optimization algorithms [15].

Notation: X > 0 (respectively, X ≥ 0) means that the matrix X is a real symmetric positive definite (respectively, positive semi-definite) matrix. I_n denotes the n-dimensional identity matrix. ⊗ denotes the Kronecker product.

2. Problem statement and preliminaries

Consider a complex dynamical network consisting of N linearly coupled nonidentical nodes described by

  ẋ_i(t) = f_i(x_i(t)) + ∑_{j=1}^{N} c_{ij} x_j(t) + u_i(t),  i = 1, ..., N    (1)

where x_i = (x_{i1}, x_{i2}, ..., x_{in})^T ∈ R^n is the state vector of the ith node, f_i : R^n → R^n is a smooth nonlinear vector field, u_i(t) is the control input of the ith node, and c_{ij} is the coupling configuration parameter representing the coupling strength and the topological structure of the network: c_{ij} is nonzero if there is a connection from node i to node j (i ≠ j), and is zero otherwise. For simplicity, let us define C = (c_{ij})_{N×N}. The diagonal elements of the matrix C are assumed to satisfy

  c_{ii} = −∑_{j=1, j≠i}^{N} c_{ij},  i = 1, ..., N.    (2)

Definition 1. A complex network is said to achieve asymptotic inner synchronization if

  x_1(t) = x_2(t) = ... = x_N(t) = s(t)  as t → ∞,

where s(t) ∈ R^n is a solution of a target node, satisfying ṡ(t) = f_s(s(t)).

Here, define the error vectors

  e_i(t) = s(t) − x_i(t).    (3)

From Eq. (3), the error dynamics are given by

  ė_i(t) = f_s(s(t)) − f_i(x_i(t)) − ∑_{j=1}^{N} c_{ij} e_j(t) − u_i(t)
         = f̄_i(e_i(t)) − ∑_{j=1}^{N} c_{ij} e_j(t) − u_i(t),    (4)

where f̄_i(e_i(t)) = f_s(s(t)) − f_i(x_i(t)). A vector-matrix form of Eq. (4) is

  ė(t) = F(t) − C ⊗ I_n e(t) − U(t),    (5)

where F = [f̄_1^T(e_1(t)), f̄_2^T(e_2(t)), ..., f̄_N^T(e_N(t))]^T, e = [e_1^T(t), ..., e_N^T(t)]^T, and U = [u_1^T(t), u_2^T(t), ..., u_N^T(t)]^T.

3. Controller design

In this section, a dynamic feedback controller is designed to achieve the synchronization goal. In order to stabilize the error system given in Eq. (5), consider the following dynamic feedback controller:

  ζ̇(t) = A_c ⊗ I_n ζ(t) + B_c ⊗ I_n e(t),
  U(t) = C_c ⊗ I_n ζ(t) + F(t),  ζ(0) = 0,    (6)

where ζ(t) ∈ R^{Nn} is the controller state, and A_c, B_c and C_c are constant gain matrices of dimension N × N. Applying the controller (6) to the error system (5) results in the closed-loop system

  ż(t) = H ⊗ I_n z(t),    (7)

where

  z(t) = [ e(t) ; ζ(t) ] ∈ R^{2Nn},  H = [ −C  −C_c ; B_c  A_c ] ∈ R^{2N×2N}.

Then we have the following main result.

Theorem 1. There exists a dynamic feedback controller of the form (6) for synchronization of the complex network (1) if there exist positive-definite matrices S, Y ∈ R^{N×N} and matrices X_1, X_2, X_3 satisfying the following LMIs:

  [ −YC^T − CY − X_1 − X_1^T   −C + X_3 ; ⋆   (2,2) ] < 0    (8)

and

  [ Y  I_N ; I_N  S ] > 0,    (9)

where (2,2) = −C^T S − S^T C + X_2 + X_2^T.

Proof. Consider the following Lyapunov function: V(t) = z^T(t) P ⊗ I_n z(t), where P ∈ R^{2N×2N} > 0. The time derivative of the Lyapunov function is

  V̇(t) = z^T(t) (H^T P + P H) ⊗ I_n z(t).    (10)

Here, let us define

  Σ ≡ H^T P + P H.    (11)

Thus, if the inequality Σ < 0 holds, synchronization of the complex network with nonidentical nodes is achieved by the proposed dynamic controller. However, in the matrix Σ, the matrix P > 0 and the controller parameters A_c, B_c and C_c, which are included in the matrix H, are unknown and occur in a nonlinear fashion. Hence, the inequality Σ < 0 cannot be treated directly as a linear matrix inequality problem. In the following, we use a change of variables such that the inequality can be solved by a convex optimization algorithm [16].

First, partition the matrix P and its inverse as

  P = [ S  J ; J^T  W ],  P^{−1} = [ Y  M ; M^T  Z ],    (12)

where S, Y are positive-definite matrices and M, J ∈ R^{N×N} are invertible matrices (the blocks W and Z are not needed explicitly). The equality P P^{−1} = I_{2N} gives

  M J^T = I_N − Y S.    (13)

Define the two matrices

  F_1 = [ Y  I_N ; M^T  0 ],  F_2 = [ I_N  S ; 0  J^T ].    (14)

Then it follows that

  P F_1 = F_2,  F_1^T P F_1 = F_1^T F_2 = [ Y  I_N ; I_N  S ] > 0.    (15)

Now, postmultiplying and premultiplying the matrix inequality Σ < 0 by the matrix F_1 and by its transpose, respectively, gives

  F_2^T H F_1 + F_1^T H^T F_2 < 0.    (16)

By utilizing the relations in Eqs. (12)-(15), it is easily obtained that the inequality (16) is equivalent to

  [ Γ_1  Γ_2 ; ⋆  Γ_3 ] < 0,    (17)

where Γ_1 = −YC^T − CY − MC_c^T − C_c M^T, Γ_2 = −C − YC^T S + Y X_2^T − X_1 S + M A_c^T J^T, and Γ_3 = −C^T S − S^T C + J B_c + B_c^T J^T. By defining a new set of variables X_1 = M C_c^T, X_2 = J B_c, X_3 = −YC^T S + Y X_2^T − X_1 S + M A_c^T J^T, Eq. (17) is simplified to the LMI (8). The LMI (9) is equivalent to the positiveness of P. This completes the proof.

Remark 1. Given any solution of the LMIs (8) and (9) in Theorem 1, a corresponding controller of the form (6) is constructed as follows:
• Compute the invertible matrices M and J satisfying Eq. (13) using matrix algebra.
• Utilizing the matrices M and J obtained above, solve the equations for the X_i to obtain B_c, C_c and A_c (in this order).

4. Numerical example

In order to show the effectiveness of the proposed method, we present a numerical example: inner synchronization of a complex network with five nonidentical nodes. The nodes are different chaotic systems, namely the well-known Lorenz, Chen, Lü, Chen-Lee and Genesio-Tesi systems. They are typical benchmark three-dimensional chaotic systems, and their chaotic behavior is displayed in Fig. 1.

[Fig. 1: Original chaotic behavior of each node (phase portraits of the Lorenz, Chen, Lü, Chen-Lee and Genesio-Tesi systems).]

Thus, the complex network system consisting of five nodes is described by

  ẋ_i(t) = f_i(x_i(t)) + ∑_{j=1}^{N} c_{ij} x_j(t) + u_i(t),    (18)

where i = 1, ..., 5 and

  f_1(x_1(t)) = [ p_1(x_{12} − x_{11}) ; p_3 x_{11} − x_{12} − x_{11} x_{13} ; x_{11} x_{12} − p_2 x_{13} ],
  f_2(x_2(t)) = [ p_4(x_{22} − x_{21}) ; (p_6 − p_4) x_{21} + p_6 x_{22} − x_{21} x_{23} ; x_{21} x_{22} − p_5 x_{23} ],
  f_3(x_3(t)) = [ p_7(x_{32} − x_{31}) ; p_9 x_{32} − x_{31} x_{33} ; x_{31} x_{32} − p_8 x_{33} ],
  f_4(x_4(t)) = [ q_1 x_{41} − x_{42} x_{43} ; −q_2 x_{42} + x_{41} x_{43} ; −q_3 x_{43} + (1/3) x_{41} x_{42} ],
  f_5(x_5(t)) = [ x_{52} ; x_{53} ; −q_4 x_{51} − q_5 x_{52} − q_6 x_{53} + x_{51}^2 ],

with the parameters p_1 = 10, p_2 = 8/3, p_3 = 28, p_4 = 35, p_5 = 3, p_6 = 28, p_7 = 36, p_8 = 3, p_9 = 20, q_1 = 5, q_2 = 10, q_3 = 3.8, q_4 = 6, q_5 = 2.92, and q_6 = 1.2. The nodes exhibit chaotic behavior as follows: f_1(x_1(t)) is the Lorenz system, f_2(x_2(t)) the Chen system, f_3(x_3(t)) the Lü system, f_4(x_4(t)) the Chen-Lee system, and f_5(x_5(t)) the Genesio-Tesi system. The target node is selected to be the same as the first node, a Lorenz system. In this example a random function with |d(t)| < 10 is used for all initial conditions x_i(0) = (d(t), d(t), d(t)), s(0) = (d(t), d(t), d(t)), (i = 1, 2, ..., 5), and the coupling matrix C is given by

  C = 0.2 × [ −3 1 1 0 1 ; 1 −4 1 1 1 ; 0 1 −2 1 0 ; 0 1 1 −3 1 ; 1 0 0 1 −2 ].    (19)
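The feasibility test of Theorem 1 is an ordinary semidefinite program. As a rough illustration only (the paper's own computation uses the MATLAB LMI Control Toolbox, as noted below), the LMIs (8)-(9) for the coupling matrix (19) could be checked with a generic SDP front end such as cvxpy; the strictness margin eps and the solver choice are our assumptions:

import numpy as np
import cvxpy as cp

N = 5
C = 0.2 * np.array([[-3, 1, 1, 0, 1],
                    [ 1,-4, 1, 1, 1],
                    [ 0, 1,-2, 1, 0],
                    [ 0, 1, 1,-3, 1],
                    [ 1, 0, 0, 1,-2]], dtype=float)
I = np.eye(N)

S  = cp.Variable((N, N), symmetric=True)
Y  = cp.Variable((N, N), symmetric=True)
X1 = cp.Variable((N, N))
X2 = cp.Variable((N, N))
X3 = cp.Variable((N, N))

blk11 = -Y @ C.T - C @ Y - X1 - X1.T   # (1,1) block of LMI (8)
blk12 = -C + X3                        # (1,2) block of LMI (8)
blk22 = -C.T @ S - S @ C + X2 + X2.T   # (2,2) block of LMI (8)
lmi8 = cp.bmat([[blk11, blk12], [blk12.T, blk22]])
lmi8 = (lmi8 + lmi8.T) / 2             # symmetrize explicitly for the solver
lmi9 = cp.bmat([[Y, I], [I, S]])

eps = 1e-6                             # small margin to enforce strict inequalities
prob = cp.Problem(cp.Minimize(0),
                  [lmi8 << -eps * np.eye(2 * N),
                   lmi9 >>  eps * np.eye(2 * N)])
prob.solve()
print(prob.status)                     # 'optimal' means the LMIs are feasible

From a feasible (S, Y, X_1, X_2, X_3) the gains then follow by the two steps of Remark 1, for example with M = I_N − YS and J = I_N, which satisfy Eq. (13).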

In order to show the original behavior of the complex network (18) with nonidentical nodes, the trajectories of the complex network (18) without a controller are depicted in Fig. 2.

[Fig. 2: Error signals of the example without control input (error e versus time, 0-20 s).]

Now, we design a suitable dynamic feedback controller of the form (6) for the system (18) which guarantees the asymptotic stability of the closed-loop system. By application of Theorem 1 to the system (18) and checking the feasibility of the LMIs given in Eqs. (8) and (9), we find that the LMIs are feasible by use of the LMI Control Toolbox and obtain a possible set of solutions of the LMIs; due to space limitations, the solutions are omitted here. Then, by further calculation in light of Remark 1, we have a possible stabilizing dynamic feedback controller for the system (18):

  A_c = [ −1.6000  0.2667  0.0376  0.0580 −0.0638 ;
           0.3141 −1.6010  0.0853  0.3113  0.1537 ;
           0.0212  0.2735 −1.6185  0.0896 −0.1972 ;
           0.0639  0.2963  0.0465 −1.4779  0.1795 ;
           0.1297  0.1701 −0.1499  0.1795 −1.5026 ]

  B_c = [ −1.1000  0.2000  0.1000  0.0000  0.2000 ;
           0.2904 −0.8863 −0.1097  0.5507 −0.5120 ;
           0.0294  0.8606 −0.0314 −0.0612 −0.7567 ;
           0.0609 −0.0716  0.2112 −1.0067  0.1488 ;
           0.0330  0.4988 −0.9123  0.0974  0.0563 ]

  C_c = [ −1.1000  0.2904  0.0294  0.0609  0.0330 ;
           0.2000 −0.8863  0.8606 −0.0716  0.4988 ;
           0.1000 −0.1097 −0.0314  0.2112 −0.9123 ;
           0.0000  0.5507 −0.0612 −1.0067  0.0974 ;
           0.2000 −0.5120 −0.7567  0.1488  0.0563 ]
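Because the closed loop (7) is linear, any candidate gains can be sanity-checked by the eigenvalues of H alone: the Kronecker factor I_n only replicates each eigenvalue n times. A numpy sketch using the matrices transcribed above (the sign placement of some B_c entries is hard to read in the source; in the printed solution B_c coincides entry-for-entry with C_c^T, which is used here, so treat the numbers as illustrative):

import numpy as np

C = 0.2 * np.array([[-3, 1, 1, 0, 1],
                    [ 1,-4, 1, 1, 1],
                    [ 0, 1,-2, 1, 0],
                    [ 0, 1, 1,-3, 1],
                    [ 1, 0, 0, 1,-2]], dtype=float)
Ac = np.array([[-1.6000, 0.2667, 0.0376, 0.0580,-0.0638],
               [ 0.3141,-1.6010, 0.0853, 0.3113, 0.1537],
               [ 0.0212, 0.2735,-1.6185, 0.0896,-0.1972],
               [ 0.0639, 0.2963, 0.0465,-1.4779, 0.1795],
               [ 0.1297, 0.1701,-0.1499, 0.1795,-1.5026]])
Cc = np.array([[-1.1000, 0.2904, 0.0294, 0.0609, 0.0330],
               [ 0.2000,-0.8863, 0.8606,-0.0716, 0.4988],
               [ 0.1000,-0.1097,-0.0314, 0.2112,-0.9123],
               [ 0.0000, 0.5507,-0.0612,-1.0067, 0.0974],
               [ 0.2000,-0.5120,-0.7567, 0.1488, 0.0563]])
Bc = Cc.T   # observed relation in the printed solution; an assumption here

H = np.block([[-C, -Cc],
              [ Bc,  Ac]])
# Stability of zdot = (H kron I_n) z is decided by H alone.
print("max Re(eig(H)) =", np.linalg.eigvals(H).real.max())  # < 0 => stable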

The simulation result with control input is presented in Fig. 3. As seen in Fig. 3, the trajectories of the error systems approach zero as expected. This achieves asymptotic synchronization of the complex network (18).

[Fig. 3: Error signals of the example with control input (error e versus time, 0-20 s).]

5. Conclusions

In this paper, the asymptotic inner synchronization of a complex dynamical network has been studied. Nonidentical nodes were considered for the sake of realism. Unlike other works, a dynamic feedback controller was designed for our synchronization scheme based on the Lyapunov method, and a criterion for the stability of the error dynamics was derived in terms of LMIs. Finally, a numerical example was presented to show the effectiveness of the designed controller.

Acknowledgements

This work was supported in part by MEST & DGIST (12-BD-0101, Renewable energy and intelligent robot convergence technology development). This research was also supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education, Science and Technology (2010-0009373).

References

[1] S.H. Strogatz. Exploring complex networks. Nature 410 (2001) 268-276.
[2] J. Zhou, J.A. Lu and J. Lu. Pinning adaptive synchronization of a general complex dynamical network. Automatica 44 (2008) 996-1003.
[3] L. Xiang and J.J.H. Zhu. On pinning synchronization of general coupled networks. Nonlinear Dynamics 64 (2011) 339-348.
[4] G. Solis-Perales, E. Ruiz-Velazquez, D. Valle-Rodriguez. Synchronization in complex networks with distinct chaotic nodes. Commun. Nonlinear Sci. Numer. Simulat. 14 (2009) 2528-2535.
[5] Q. Song, J. Cao, F. Liu. Synchronization of complex dynamical networks with nonidentical nodes. Physics Letters A 374 (2010) 544-551.
[6] D. Xu and Z. Su. Synchronization criterions and pinning control of general complex networks with time delay. Appl. Math. Comput. 215 (2009) 1593-1608.
[7] H. Tang, L. Chen, J. Lu and C.K. Tse. Adaptive synchronization between two complex networks with nonidentical topological structures. Physica A 387 (2008) 5623-5630.
[8] S. Zheng, Q. Bi and G. Cai. Adaptive projective synchronization in complex networks with time-varying coupling delay. Physics Letters A 373 (2009) 1553-1559.
[9] P. Erdős and A. Rényi. On random graphs. Pub. Math. 6 (1959) 290-297.
[10] P. Erdős and A. Rényi. On the evolution of random graphs. Publications of the Mathematical Institute of the Hungarian Academy of Sciences 5 (1960) 17-61.
[11] D.J. Watts and S.H. Strogatz. Collective dynamics of small-world networks. Nature 393 (1998) 440-442.
[12] M.E.J. Newman and D.J. Watts. Renormalization group analysis of the small-world network model. Physics Letters A 263 (1999) 341-346.
[13] A.L. Barabási and R. Albert. Emergence of scaling in random networks. Science 286 (1999) 509-512.
[14] J.H. Park. Convex optimization approach to dynamic output feedback control for delay differential systems of neutral type. J. Optimization Theory Appl. 127 (2005) 411-423.
[15] S. Boyd, L.E. Ghaoui, E. Feron and V. Balakrishnan. Linear matrix inequalities in systems and control theory. Philadelphia: SIAM, 1994.
[16] C. Scherer, P. Gahinet, M. Chilali. Multiobjective output-feedback control via LMI optimization. IEEE Transactions on Automatic Control 42 (1997) 896-911.

SESSION HPC AND MULTI-PROCESSOR MULTI-CORE SYSTEMS + DESIGN ISSUES + FPGA + GPU + NOC + EMBEDDED SYSTEMS

Chair(s)

TBA

A Novel Branch Predictor Using Local History for Miss-Prediction Bias

Lin Meng1, Katsuhiro Yamazaki1, and Shigeru Oyanagi2 1College of Science and Engineering, Ritsumeikan University, Kusatsu, Shiga, Japan 2College of Information Science and Engineering, Ritsumeikan University, Kusatsu, Shiga, Japan

Abstract— Increasing the accuracy of branch prediction is important for enhancing the performance of current processors. This paper discusses an interesting behavior of current branch predictors: a large share of miss-predictions is caused by a few branches. We propose a novel branch prediction mechanism using local history for miss-prediction biased branches. This mechanism is attached to a conventional branch predictor and utilizes the local history of biased branch instructions without conflict aliasing. Experiments are done by attaching the proposed mechanism to several predictors. The results show that the proposed mechanism reduces miss-predictions by about 10% and increases performance by about 2% compared to the conventional predictors on SPECint2000.

Keywords: branch prediction; miss-prediction; miss-prediction bias

1. Introduction

Current processors use deeper pipelines and wider instruction issue widths to exploit instruction level parallelism. However, when a branch miss-prediction happens, it causes a heavy penalty, wasting a large number of cycles and much power for miss-prediction recovery [1]. Hence, increasing the accuracy of branch prediction is ever more important for improving superscalar processor performance.

In 1981, the first branch predictor, named the Bimodal predictor [2], was proposed. Since then, many branch predictors have been proposed to increase the accuracy of branch prediction. The Bimodal predictor uses 2-bit saturating counters to track the behavior of branches. These 2-bit saturating counters form the PHT (Pattern History Table), which is indexed by the lower branch instruction address bits. The Gshare predictor [3] is widely used in current processors; it uses the exclusive OR of the GBH (global branch history) and the branch address to form the PHT index.

One of the main causes of miss-predictions is conflict aliasing, which occurs when different branches access the same PHT entry. Most proposed predictors use global history to exploit the correlation among recently executed branches. Despite the steady improvements that have been made, it is difficult to completely avoid conflict aliasing, hence many branches are still miss-predicted [10].

This paper analyzes the behavior of branch predictors and focuses on miss-prediction bias. Miss-prediction bias means that the miss-predictions of a few branches account for a large share of all miss-predictions. We propose a novel branch prediction mechanism that exploits miss-prediction bias. The proposed mechanism is attached to a conventional branch predictor and removes the conflict aliasing by preparing a separate PHT entry for each miss-prediction biased branch and by utilizing the local history of the miss-prediction biased branches.

The rest of this paper is organized as follows. Section 2 reviews current branch predictors. Section 3 shows the characteristics of miss-prediction bias using conventional predictors. Section 4 explains our proposal, which targets miss-prediction biased branches. Section 5 describes the evaluation of our proposal by simulation on the SimpleScalar Tool Set [15]; in this simulation, our mechanism is attached to the Combining, Bimode, Bimode-Plus, Agree, Hybrid and TAGE predictors. Sections 6 and 7 discuss conclusions and future research.

2. Related Work

Many current branch predictors can be explained as extensions of base predictors intended to reduce miss-predictions. The widely used base predictors are the Bimodal and Gshare predictors. Several predictors following this approach are described below.

Combining [3]: The Combining predictor consists of a Bimodal predictor, which works well for local history, and a Gshare predictor, which works well for global history. A selector built from 2-bit saturating counters chooses the result of either the Bimodal predictor or the Gshare predictor.

Bimode [4]: The Bimode predictor prepares two Gshare predictors, one for the branches biased toward Taken (the Taken Gshare predictor) and the other for the branches biased toward NotTaken (the NotTaken Gshare predictor). A ChoicePHT then chooses the result of one of the two Gshare predictors. The Bimode predictor reduces conflict aliasing by dividing oppositely biased branches between different PHTs.

Bimode-Plus [7]: Some branches are strongly biased toward one direction (Taken or NotTaken) for the whole run of the program. The Bimode-Plus predictor provides a Bias Table, which keeps a Taken bit and a NotTaken bit, to detect the branches that are strongly biased toward Taken or NotTaken.

Table 1: Processor configuration

Pipeline       5 stages: 1 Fetch, 1 Decode, 1 Execute, 1 Memory Access, 1 Commit
Fetch, Decode  4 instructions
Issue          Int: 4, fp: 2, mem: 2
Window         Dispatch queue: 256, Issue queue: 256
BTB            2K-entry 4-way associative BTB, 32-entry RAS
Memory         64KB, 4-way associative, 1-cycle instruction and data caches;
               2MB, 8-way associative, 10-cycle L2

When a strongly biased branch is detected, the predictor uses the Bias Table without using the result of the Bimode predictor and without updating the Bimode predictor. In this way, the Bimode-Plus predictor reduces the conflict aliasing between the strongly biased branches and normal branches.

Agree [6]: The Agree predictor keeps a Taken biased bit and a NotTaken biased bit in the BTB. The PHT records whether the outcome agrees with the biased bit: when the branch result matches the biased bit, the PHT entry is incremented, otherwise it is decremented. The exclusive OR of the biased bit and the PHT result is used as the prediction.

Hybrid [5]: The Hybrid predictor keeps several predictors (2bc, Gshare, GAS, AVG) to predict the branch. The BTB keeps the result of each predictor and is used at prediction time to select the most accurate one among the several predictors.

TAGE, L-TAGE [8], [9]: These predictors use PPM (prediction by partial matching) to search for patterns in the branch direction history [11]. This method has 4 PHTs, which are accessed by the exclusive OR of different parts of the GBHR and the branch address. Every PHT entry has a tag (a part of the branch address) to protect against conflicts. Hence, this method uses the patterns of the GBHR to improve the prediction accuracy.

3. Miss-Prediction Bias

We observe that on conventional predictors the miss-predictions of a few branches account for a large share of total miss-predictions. This section shows this miss-prediction bias. Simulation is performed using the Bimode predictor on the SimpleScalar Tool Set; Table 1 shows the processor configuration. The instruction set is PISA, and the benchmarks are bzip, gcc, gzip, mcf, parser, twolf, vpr and vortex from SPECint2000.

In the experiments, the top 8 and top 16 miss-predicted branches are observed; we call these branches miss-prediction biased branches. Figure 1 shows the ratio of the miss-predictions of the top 8 and top 16 miss-prediction biased branches to total miss-predictions. The experiments run 100M instructions, plotted in 20M-instruction intervals, and the predictor size is set to 8KB, 16KB and 32KB.

[Fig. 1: Miss-prediction bias rate in the Bimode predictor.]

In Figure 1, the horizontal axis is the number of executed instructions and the vertical axis is the ratio of miss-predictions at the miss-prediction biased branches to total miss-predictions. Clearly, Figure 1 shows that the miss-predictions of the top 8 miss-prediction biased branches make up more than 70% of total miss-predictions, and those of the top 16 miss-prediction biased branches make up more than 80%.

4. Proposed Predictor Attached on a Conventional Branch Predictor

The behavior of miss-prediction biased branches was demonstrated in the last section: a small number of branches cause most of the miss-predictions. This section proposes a novel predictor that exploits this behavior. One of the major causes of miss-predictions in many predictors is the conflict aliasing of different branches accessing the same PHT entry. Our approach to this problem is to allocate a distinct PHT entry for each miss-prediction biased branch and to utilize the local history of each miss-prediction biased branch without conflict aliasing. The proposed mechanism can be attached to any kind of conventional branch predictor.

Figure 2 shows a block diagram of our proposal. An MBD (Miss Bias Detector) and an LHBP (Local History Branch Predictor) are attached to a base predictor. The MBD detects miss-prediction biased branches; the LHBP predicts the miss-prediction biased branches using their local history.

The prediction flow is as follows (see the sketch after this paragraph). First, miss-predictions of the base predictor are counted in the extended BTB (EBTB). When the count of miss-predictions exceeds a threshold, the branch is recognized as a miss-prediction biased branch, and its local history is stored into the MBB (Miss Bias Buffer). A distinct LPHT entry is assigned for each miss-prediction biased branch to predict the branch direction based on the local history. Finally, the Selector selects the result of either the LHBP or the base predictor as the final prediction.
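The following Python sketch is our schematic model of this flow, not the paper's implementation; table sizes, thresholds and the Gshare base predictor are illustrative placeholders, and MBB replacement (LRU in the paper) is omitted:

PHT_BITS = 12                      # base Gshare PHT size (illustrative)
MCT_MAX, MCT_THRESH = 15, 15       # 4-bit saturating miss counter
CF_THRESH = 3                      # Selector confidence threshold
LH_BITS = 9                        # local history length

pht = [1] * (1 << PHT_BITS)        # base predictor: 2-bit counters
gbh = 0                            # global branch history
mct = {}                           # EBTB miss counters (MCT), keyed by PC
mbb = {}                           # MBB: PC -> local history (LH)
lpht = {}                          # LPHT: (PC, LH) -> [NTCT, CF]

def base_predict(pc):
    idx = (pc ^ gbh) & ((1 << PHT_BITS) - 1)   # Gshare index: PC xor GBH
    return pht[idx] >= 2

def predict(pc):
    if pc in mbb:                              # biased branch: try the LHBP
        ntct, cf = lpht.get((pc, mbb[pc]), [1, 0])
        if cf >= CF_THRESH:                    # Selector trusts the LHBP
            return ntct >= 2                   # only when confidence is high
    return base_predict(pc)

def commit(pc, taken):
    global gbh
    base_ok = base_predict(pc) == taken
    if pc in mbb:                              # LHBP update for biased branches
        lh = mbb[pc]
        entry = lpht.setdefault((pc, lh), [1, 0])
        lhbp_ok = (entry[0] >= 2) == taken
        entry[0] = min(3, entry[0] + 1) if taken else max(0, entry[0] - 1)
        entry[1] = entry[1] + 1 if lhbp_ok else 0     # CF resets on a miss
        mbb[pc] = ((lh << 1) | taken) & ((1 << LH_BITS) - 1)
    elif not base_ok:                          # MBD: count base-predictor misses
        mct[pc] = min(MCT_MAX, mct.get(pc, 0) + 1)
        if mct[pc] >= MCT_THRESH and len(mbb) < 8:    # 8-entry MBB, no LRU here
            mbb[pc] = 0
            mct[pc] = 0
    idx = (pc ^ gbh) & ((1 << PHT_BITS) - 1)   # base predictor update
    pht[idx] = min(3, pht[idx] + 1) if taken else max(0, pht[idx] - 1)
    gbh = ((gbh << 1) | taken) & ((1 << PHT_BITS) - 1)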

Fig. 2: Block diagram of proposed branch predictor

4.1 Detection of Miss-Prediction Biased Branches by the MBD

The MBD is composed of the EBTB and the MBB. The EBTB is used to detect miss-prediction biased branches; it is an extended version of the BTB that keeps an MCT (Miss Counter) in every entry. The MCT is a saturating counter that keeps the miss-prediction count of the base predictor. The EBTB also keeps a Tag and a branch target address (Taddr), like a conventional BTB. The MBB is composed of Addr, LH, U and FR fields, which keep the information of the miss-prediction biased branches. Addr keeps the address of a miss-prediction biased branch, LH is a shift register keeping its local history, U is a use bit marking whether the MBB entry is occupied, and FR (Failure Rate) keeps the difference in miss-prediction count between the LHBP and the base predictor.

Since the top 8 or 16 miss-prediction biased branches account for most of the total miss-predictions, the MBB size is set to 8 or 16 entries, and the proposed predictor targets the top 8 or 16 miss-prediction biased branches.

The MBD registers miss-prediction biased branches and updates their information, as follows.

Registration in the MBB: When a branch is committed, the branch address is looked up in the EBTB. If the EBTB hits and a miss-prediction occurred, the EBTB's MCT is incremented; if the EBTB misses, the Tag is replaced as in a conventional BTB and the MCT is set to 1. If the MCT reaches its threshold, the branch address is registered into the MBB, and the corresponding EBTB entry is reset.

When registering into the MBB, the branch address is first checked against the MBB. If the address is not already present, it is registered into an entry whose U bit is 0, and the corresponding U bit is set to 1. If all of the MBB's U bits are 1, the predictor uses LRU logic to find the least recently used entry and registers the new branch address into that entry; its U bit is set to 1.

Update of the MBB: When a branch is committed, the branch address is looked up in the MBB. If the branch address is found in the MBB, the direction of the branch is shifted into LH. FR denotes whether the LHBP entry is effective or not: FR is incremented when the prediction of the LHBP is incorrect while the prediction of the base predictor is correct, and decremented in the opposite case. In this way, FR keeps the difference in miss-prediction count between the LHBP and the base predictor. When FR reaches its threshold, the LHBP entry is recognized as worse than the base predictor, and the LHBP entry is reset.

4.2 Prediction by the LHBP

The LHBP is a predictor targeted at the miss-prediction biased branches registered in the MBB. Because the number of miss-prediction biased branches is small, the LHBP can use the local history of each miss-prediction biased branch to form its PHT index. To avoid conflicts among miss-prediction biased branches, a distinct LPHT region is prepared for each of them: an LPHT entry is indexed by combining the MBB entry address with that entry's local history. Hence, there is no conflict in the LPHT (see the sketch below).
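The index arithmetic is simple: with m MBB entries and an n-bit local history (the sizing discussed next), the LPHT needs m × 2^n entries and no two biased branches can collide. A minimal sketch, with helper names of our own:

MBB_ENTRIES = 8                          # m: top-8 miss-prediction biased branches
LH_BITS = 9                              # n: local history length
LPHT_ENTRIES = MBB_ENTRIES << LH_BITS    # m * 2**n = 4K entries

def lpht_index(mbb_slot, local_history):
    """Concatenate the MBB entry number with its local history.

    Because the high bits are the MBB slot, each biased branch owns a
    private 2**n-entry region of the LPHT: no conflict aliasing.
    """
    assert 0 <= mbb_slot < MBB_ENTRIES
    return (mbb_slot << LH_BITS) | (local_history & ((1 << LH_BITS) - 1))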

Fig. 3: Detection rate of miss-prediction biased branches vs. MCT length.

Fig. 5: Miss-prediction reduction rate vs. MBB and LPHT size (Bimode).

The CF is the confidence of the LPHT entry; it uses a 2-bit Miss Resetting Counter [12], [13], [14] with a threshold of 3, because the Miss Resetting Counter is known as the best method for increasing the PVN (Predictive Value of a Negative Test) [12].

Update of the LPHT: When a branch is committed, the branch address is looked up in the MBB. If the branch exists in the MBB, the branch result (Taken/NotTaken) is used to update the NTCT of the corresponding LPHT entry. When the LHBP prediction is correct, the CF is incremented; otherwise the CF is reset to 0.

Branch Prediction: When a branch instruction is fetched, the branch address is looked up in the MBB. If the branch exists in the MBB, the NTCT and CF values of the corresponding LPHT entry are used for branch prediction. At the same time, the base predictor also predicts. The Selector then selects the result of either the base predictor or the LHBP: if the CF has reached its threshold, the predicted result of the LHBP is selected; otherwise, the predicted result of the base predictor is selected.

5. Evaluation

5.1 Analysis of the Component Size

To evaluate our proposal, the optimum component sizes must be determined. Here we discuss the MCT length and the MBB and LPHT entry sizes: the MCT length is important for identifying the miss-prediction biased branches, while the MBB entry size and the LPHT entry size are important for increasing prediction accuracy. This section determines the optimum component sizes by experiment. The experiments are performed on the SimpleScalar Tool Set under the processor configuration given in Table 1. The benchmarks are bzip, gcc, gzip, mcf, parser, twolf, vpr and vortex from SPECint2000 and drr, reed_dec, reed_enc, rtr and zip_enc from CommBench [16]. The instruction set is PISA.

MCT Length: The MCT is a saturating counter that counts the misses of BTB entries to detect the miss-prediction biased branches. To determine the MCT size, an experiment is performed with the Bimode predictor as the base predictor, 8 MBB entries, and 8K LPHT entries, varying the MCT length from 4 bits to 8 bits.

Figure 3 shows the rate of detected miss-prediction biased branches relative to total miss-predictions for the various MCT lengths; the vertical axis is the average detection rate. Figure 4 shows the reduction rate of miss-predictions; the vertical axis is the average reduction rate.

[Fig. 4: Miss-prediction reduction rate vs. MCT length.]

From Figures 3 and 4, about 70% of miss-predictions can be detected and about 10% of miss-predictions can be reduced by our proposal. Moreover, the miss-prediction reduction rate changes little with the MCT length. Hence, a 4-bit MCT is used in the following experiments in order to minimize hardware cost.

MBB and LPHT Entry Size: The LPHT entry count is the product of the MBB entry count and the local history length. Generally, a larger MBB can store more miss-prediction biased branches and brings better performance, while a larger LPHT can store longer local histories and brings more accurate prediction. In this experiment, the MBB and LPHT sizes are set to (8, 4K) with a 9-bit LH, (8, 8K) with a 10-bit LH, (16, 8K) with a 9-bit LH, and (16, 16K) with a 10-bit LH, with the Bimode predictor as the base predictor. Figure 5 shows the miss-prediction reduction rate obtained by attaching our proposal to the base predictor; the vertical axis shows the average reduction rate.

Fig. 6: Miss-prediction reduction rate with 8KB base predictors.

Fig. 7: Miss-prediction reduction rate with 32KB base predictors.

From Figure 5, the difference in miss-prediction reduction rate between the largest setting (MBB = 16, LPHT = 16K) and the smallest setting (MBB = 8, LPHT = 4K) is only 2% on SPECint2000 and 1% on CommBench with the Bimode predictor. Hence, considering the hardware cost, the smallest setting (MBB = 8, LPHT = 4K) is used in the following experiments.

5.2 Miss-Prediction Reduction Rate and IPC Performance

We measure the performance of our proposal on several base predictors (the Combining, Bimode, Bimode-Plus, Agree, Hybrid and TAGE predictors).

Figures 6 and 7 show the miss-prediction reduction rate of our proposal relative to the 6 kinds of base predictors sized at 8KB and 32KB. The horizontal axis shows the benchmark and the vertical axis the miss-prediction reduction rate. The results show that our proposal reduces miss-predictions for all 6 predictors on average: when the base predictor size is 8KB, our proposal reduces miss-predictions by more than 7% on average, and when the base predictor size is 32KB, by more than 5% on average.

Figures 8 and 9 show the IPC improvement of our proposal relative to the 6 kinds of base predictors sized at 8KB and 32KB. The horizontal axis shows the benchmark and the vertical axis the IPC improvement. The results show that our proposal improves IPC by about 2% on average over the base predictors.

In these experiments, we find that the Combining, Bimode and Bimode-Plus predictors behave similarly, because they all build on Gshare and Bimodal predictors. We also find several cases that prevent our proposal from reducing miss-predictions. One is that some benchmarks, such as reed_dec, already predict well on the base predictor; in this case, our proposal cannot further improve the prediction. Another is that a benchmark, such as gcc, has a large number of miss-predicted branches; our proposal brings no improvement there because the MBB is not big enough to capture the miss-prediction biased branches correctly. Finally, when the branch predictor contains several component predictors, as the Hybrid predictor does, our proposal cannot bring a large reduction of miss-predictions: the Hybrid predictor adapts to branch behavior by switching among its predictors, and our proposal cannot capture the branch behavior as early.

Fig. 8: IPC performance enhancement with 8KB base predictors.

Fig. 9: IPC performance enhancement with 32KB base predictors.

6. Discussion

6.1 Factors for Miss-Prediction

We consider the major factors behind miss-predictions to be random behavior, conflict aliasing and undetectable branch patterns. To improve prediction accuracy, attacking the latter two factors is important. Our proposed predictor reduces conflict aliasing by detecting miss-prediction biased branches and preparing a dedicated local-history predictor for them. A number of predictors have been proposed to reduce conflict aliasing, as explained in Section 2, but these predictors do not utilize the local histories of miss-predicted branches. The key idea of our proposal is to utilize the local history of the miss-prediction biased branches, which makes it possible to detect patterns in the local history. The effectiveness of our proposal is supported by the experimental findings that a longer local history brings better performance and that prediction accuracy improves when the proposed mechanism is attached to conventional predictors.

6.2 Discussion on Hardware Costs

This subsection discusses the hardware cost of our proposal in terms of the required memory size. The EBTB uses the current BTB's association port, so no additional association port is needed. Every EBTB entry adds a 4-bit MCT, and there are 2K entries, so the additional hardware cost is only 1KB of memory in the BTB. The MBB has 8 entries; every entry has a 32-bit Addr, a 9-bit LH, a 1-bit U and a 7-bit FR, so the MBB is a small CAM of 392 bits in total. The LPHT has 4K entries; every entry has a 2-bit NTCT and a 2-bit CF, so the LPHT is 2KB of memory in total. Hence, the hardware added by our proposal is small.

Here we use MPKI (miss-predictions per kilo-instructions) as a measure. Table 2 shows the MPKI of the 6 kinds of base predictors (Combining, Bimode, Bimode-Plus, Agree, TAGE and Hybrid) with and without our proposal. The MPKI is the average over SPECint2000 and over CommBench, where the base predictor size is 10.5KB, 17.75KB, 30KB and 60.5KB for the Hybrid predictor and 8KB, 16KB, 32KB and 64KB for the other predictors.

Table 2 shows that our proposal reduces MPKI relative to the base predictors. On SPECint2000, our proposal attached to an 8KB base predictor (Combining, Bimode, Bimode-Plus or Agree) achieves the same MPKI as the 64KB base predictor alone. The experimental results show the same tendency on CommBench. Attaching the proposed mechanism to the smaller Hybrid and TAGE predictors likewise achieves the same MPKI as the larger predictors.

Table 2: Miss-predictions per kilo-instructions (average)

SPECint2000:
Predictor              8KB    16KB   32KB   64KB
Combining              5.53   5.30   5.06   4.84
Combining+proposal     4.76   4.60   4.43   4.30
Bimode                 5.40   5.14   4.93   4.73
Bimode+proposal        4.76   4.55   4.38   4.28
Bimode-Plus            5.36   5.12   4.90   4.71
Bimode-Plus+proposal   4.71   4.50   4.35   4.24
Agree                  6.71   6.49   6.35   6.21
Agree+proposal         5.94   5.79   5.69   5.61
TAGE                   4.86   4.34   4.34   4.04
TAGE+proposal          4.41   4.15   4.08   3.90

                       10.5KB 17.75KB 30KB  60.5KB
Hybrid                 7.50   6.23   5.73   5.32
Hybrid+proposal        6.92   5.76   5.27   5.03

CommBench:
Predictor              8KB    16KB   32KB   64KB
Combining              7.34   7.20   6.96   6.67
Combining+proposal     6.48   6.43   6.35   6.18
Bimode                 7.34   7.03   6.84   6.52
Bimode+proposal        6.50   6.37   6.27   6.15
Bimode-Plus            7.29   7.00   6.82   6.51
Bimode-Plus+proposal   6.45   6.34   6.23   6.13
Agree                  7.51   7.44   7.35   7.20
Agree+proposal         6.78   6.88   6.82   6.75
TAGE                   7.91   7.32   6.99   6.54
TAGE+proposal          6.81   6.55   6.37   6.09

                       10.5KB 17.75KB 30KB  60.5KB
Hybrid                 7.77   7.00   6.76   6.51
Hybrid+proposal        7.40   6.76   6.60   6.39

7. Conclusion and Future Work

Improving branch prediction accuracy is important for modern processors, which exploit instruction level parallelism with deeper pipelines and wider instruction issue widths. This paper proposed a novel branch predictor that improves prediction accuracy by utilizing the behavior of miss-prediction biased branches. The predictor detects the miss-prediction biased branches and predicts their direction using a local-history-based predictor attached to the base predictor.

The proposal was evaluated by experiment on the SimpleScalar Tool Set, attached to the Combining, Bimode, Bimode-Plus, Agree, Hybrid and TAGE predictors, and the miss-prediction reduction rate relative to the base predictors was measured. The results show that our proposal reduces miss-predictions by about 10% and increases performance by about 2% on SPECint2000; it also reduces miss-predictions by about 7% and increases performance by about 1% on CommBench.

Although our proposal reduces average miss-predictions relative to the conventional predictors, miss-predictions increase on some benchmarks. These cases must be detected and improved in future work.

8. Acknowledgment

This work is supported in part by Scinics Co., Ltd.

References

[1] B. Sinharoy, R. Kalla, J. Tendler, R. Eickemeyer, and J. Joyner, "POWER5 System Microarchitecture," IBM Journal of Research and Development, Vol. 49, No. 4/5, May 2005.
[2] J.E. Smith, "A Study of Branch Prediction Strategies," Proc. of 8th ISCA, pp. 135-148, 1981.
[3] S. McFarling, "Combining Branch Predictors," Technical Report TN-36, Digital Western Research Laboratory, 1993.
[4] Chih-Chieh Lee, I-Cheng K. Chen and Trevor N. Mudge, "The bi-mode branch predictor," MICRO 1997, pp. 4-13, Dec. 1997.
[5] M. Evers, P-Y. Chang and Y.N. Patt, "Using Hybrid Branch Predictors to Improve Branch Prediction Accuracy in the Presence of Context Switches," ISCA 1996, pp. 3-11, May 1996.
[6] E. Sprangle, Robert S. Chappell, Mitch Alsup and Yale N. Patt, "The Agree Predictor: A Mechanism for Reducing Negative Branch History Interference," ISCA 1997, pp. 284-291, June 1997.
[7] K. Kise, T. Katagiri, H. Honda and T. Yuba, "The Bimode-Plus Branch Predictor," IPSJ Trans. ACS-10, pp. 85-102, 2005.
[8] A. Seznec, "The L-TAGE branch predictor," The 2nd JILP Championship Branch Prediction Competition (CBP-2), vol. 9, May 2007.
[9] P. Michaud, "A PPM-like, tag-based branch predictor," The 1st JILP Championship Branch Prediction Competition (CBP-1), vol. 7, April 2005.
[10] L. Porter and D.M. Tullsen, "Creating Artificial Global History to Improve Branch Prediction Accuracy," ICS'09, pp. 266-275, 2009.
[11] I.-C.K. Chen, J.T. Coffey and T.N. Mudge, "Analysis of branch prediction via data compression," ASPLOS-VII, pp. 128-137, Oct. 1996.
[12] H. Akkary, S.T. Srinivasan, R. Koltur, Y. Patil and W. Refaai, "Perceptron-Based Branch Confidence Estimation," HPCA-10, p. 265, Feb. 2004.
[13] E. Jacobson, E. Rotenberg and J. Smith, "Assigning Confidence to Conditional Branch Predictions," MICRO 1996, pp. 142-152, Dec. 1996.
[14] Y. Ninomiya and K. Abe, "Power Reduction Based on Prediction Confidence Using a Perceptron Branch Predictor," SACSIS 2009, pp. 327-334, May 2009.
[15] D. Burger and T.M. Austin, "The SimpleScalar Tool Set, Version 2.0," Technical Report, University of Wisconsin-Madison Computer Sciences Dept., July 1997.
[16] T. Wolf and M.A. Franklin, "CommBench - a Telecommunications Benchmark for Network Processors," ISPASS 2000, Austin, TX, pp. 154-162, April 2000.

A Modular Processor Architecture for High-Performance Computing Applications on FPGA

Fritz Mayer-Lindenberg Institute of Computer Technology Technical University of Hamburg-Harburg [email protected]

Abstract.

This paper presents the architecture of an embedded processor for numeric applications that is tailored to the resources of current FPGA chips but not limited to FPGA based applications. The architecture implements several non-standard and even novel features. The most basic feature is to modularize the processor into a standard control core and the ALU, allowing the same controller to be used with several ALU circuits and thus exploiting the capability of the FPGA to support all kinds of data types. The controller also provides memory control for a separate data memory attached to the ALU. Another basic feature, related to the limited FPGA resources, is to realize techniques such as multi-threaded control to fill the ALU pipeline, otherwise found only on high-performance processor chips, with a low circuit complexity. The controller is about half as complex as a simple floating-point ALU; both together consume only about 2500 LUT cells of an FPGA. As an example of an ALU using more of the arithmetic resources of the FPGA, we present a vector ALU operating on 144-bit data words encoding real or complex vectors and quaternions. For the processor based on this ALU the control overhead is below 10%. The processors are interfaced to private program and data memories on the FPGA, and to external memory via an extra peripheral device on the FPGA.

Keywords: Harvard architecture, fine-grained multi-threading, soft caching, block floating point vector ALU, network-on-FPGA

Introduction.

Modern FPGA chips contain tens of thousands of single output bit cells as well as hundreds of hardwired complex arithmetic building blocks, such as parallel 18-bit multipliers, and embedded memory blocks. It is thus not astonishing that high-end FPGA chips have been proposed for supercomputing applications [1,8]; for a commercial FPGA system for HPC integrated with PC hardware see [11]. Using the FPGA for processing numbers demands that computational circuits, including scalar and vector floating-point units, be implemented. Following [2], we take the approach of implementing individual sequential controllers for these, in other words networks of processors, in contrast to coprocessing schemes. To use the FPGA resources efficiently, however, the controllers must be simple. Conventional CPU designs like [4], supporting a large memory space with caching and memory management and using a scalar general-purpose ALU, are quite complex and involve large control overheads. They communicate with their on- and off-chip memories and peripherals via parallel memory buses that don't support more than a few processors. We therefore set out for a processor design with lower control overheads, allowing more processors to be implemented on an FPGA chip to achieve a higher arithmetic performance, and one that would facilitate the use of more powerful application-specific ALU circuits.

Actually, all of the current highest-speed processor architectures achieve their performance through the use of complex ALU circuits with multiple pipelines and SIMD processing. The highest performance DSP family at the time being, the TMS320C667x family from Texas Instruments, uses two arithmetic units, each starting up to 4 single precision floating point multiply and 4 such add operations in a single cycle, using a VLIW instruction set architecture [12]. The highest performance chip of this family is the 8-core TMS320C6678, performing at a peak rate of 128 GFLOPs in single precision at a clock rate of 1GHz. The shader circuits of recent GPU chips perform similar sets of scalar operations in parallel at similar rates and use extensive SIMD on top [13].

In the present work, the 45nm Spartan-6 family of FPGA chips from Xilinx [3] has been used to implement the new processor architecture. Spartan-6 is a low cost family within the FPGA world. The most complex chip of this family is the XC6SLX150T, which contains about 92k 1-bit cells, 180 multipliers and 258 separate RAM blocks of 2k bytes each, enough to implement about 24 scalar floating point processors or 4 vector processors. It costs about as much as the TMS320C6678 yet performs about 10 times slower if scalar single precision operations are considered; this reflects the configuration overheads inherent in the FPGA architecture. If the extended single precision implemented in the FPGA processor is needed, the FPGA becomes more competitive. High-end FPGA chips can deliver a much higher performance than the Spartan-6 FPGA, yet hardly a better price by performance ratio.

Processor designs on FPGA have been in use for a long time, in spite of the configuration overheads involved; they are easy to achieve as far as the basic function of a programmable processor is concerned, and they appear in books and courses on digital design [10]. Some processor designs addressing the FPGA and trying to exploit its flexibility use features such as multi-threading and VLIW that are also important in the present design [14,15,16]. The unique feature of the present design is to adapt closely to the FPGA resources for maximum efficiency, including an ALU design adapted to the available lookup-table to multiplier ratio.

Clock rates achieved for FPGA-based processors are between 20 and 200 MHz; [7] reports on a processor design running at more than 300 MHz. The present architecture has several predecessors [9,10]. It was first implemented at a clock rate of 100 MHz and is being updated to 200 MHz by extending the controller and ALU pipelines. Before entering into the details of the new architecture, we make a few remarks on the FPGA resources that have influenced its design.

1) The hardwired memory blocks are the natural choice to implement on-FPGA memories for the processors (e.g., cache memories). Their maximum access frequency then places an upper limit on the processor clock frequency at about 300 MHz, assuming caches to be accessed at the CPU clock rate. If an FPGA is filled up with processors, however, just a few k words remain for each. This actually sets an extra premium on not implementing too many processors but rather only a few, using complex ALU circuits instead.

2) The FPGA block RAM has some characteristics and benefits that can be exploited for the processor design. First, the word width of the memory block can be configured to 9, 18, or 36 bits and hence provides some extra bits in comparison to the common 16- and 32-bit instruction and data widths found in hardwired processor designs. Second, the memory blocks are dual-ported and hence allow two independent accesses to be performed simultaneously.

3) The hardwired arithmetic circuit functions include an 18-bit signed parallel multiplier and a 48-bit adder-subtractor. These don't suffer from the FPGA configuration overheads and operate at rates of up to 250 MHz using their built-in output registers. They are significantly faster than multipliers built from individual FPGA cells. In order to fully exploit the FPGA resources, integer data should rather be 18-bit than 16-bit, or 35-bit rather than 32-bit.

4) In order to achieve clock rates close to the maximum rates of the memory blocks and the multipliers, pipelining must be applied to the processor design. Consequently, ALUs must be pipelined and need control facilities for filling their pipelines from several threads, unless the algorithms to be executed happen to achieve this from a single thread [6].

5) The Spartan-6 family and some others offer a feature called distributed RAM, which consists in configuring an individual FPGA cell that otherwise implements Boolean functions through a look-up table (LUT) as a tiny RAM with up to 64 addressable single-bit locations. This feature permits realizing register files with about the same cell count as individual registers. Register files are most useful for implementing addressable multi-port data register banks and dedicated return stack memories. Multiple threads can be supported by performing zero-overhead context switches, subdividing large register files into sub-files associated to and selected for the threads.

The controller architecture

We discuss the processor architecture in terms of architectural features. The first two of them are concerned with the diversity of data types supported by the FPGA.

AF1. The first architectural feature is to subdivide the programmable processor, which reads from an instruction list in memory to control its ALU for the different program steps, into a controller circuit and the ALU. The controller addresses the instruction memory (IRAM) and reads out instructions, implements the control flow, computes data addresses, and commands IO and memory accesses.

AF2. The ALU and the controller each get a memory of their own (Harvard architecture). The controller controls the data memory (DRAM) of the ALU yet doesn't access the data. The data memory and the data registers provided for the ALU may have an arbitrary width that may be different from the width of the instruction memory.

AF3. The instruction memory is directly interfaced to the controller, as is the data memory to the ALU. Both are implemented with FPGA memory blocks. Memory addresses and write data are applied synchronously. No extensions of the memory interfaces are provided to decode IO circuits or external memory locations.

AF4. The address spaces for the instruction and data memories are 15-bit only. The program counter used by the controller is even reduced to 14-bit.

AF5. The controller provides an input port with DMA support that connects to peripheral devices, including an external memory controller. DMA input into the DMEM is supported, too.

The latter features serve to simplify the interfacing of the processor and the address generation. In particular, no automatic cache control and no memory management are used. Instead, software-controlled caching is implemented by setting up a DMA from an external memory controller and commanding it to send an instruction or data packet.

AF6. The controller also performs data accesses to the IRAM. Controller registers and data are 18-bit. Data accesses performed by the controller occur on the second port of the dual-ported IRAM. Thus a single memory is accessed both for instructions and controller data, yet using separate ports as in a Harvard computer.

The following is a performance feature permitting controller instructions like jumps or memory accesses to occur in parallel to ALU operations. It also allows adapting to all kinds of ALU circuits.

AF7. The controller uses 18-bit instruction codes. These are extended by extra instruction bits passed to the ALU by widening the read port of the IRAM, thus using VLIW. For the ALU discussed below, the widening is to 36-bit, but larger instruction words could be used as well.

AF8. The controller executes up to 4 control threads. Zero-overhead context switches are implemented through register bank switches in response to an instruction bit.
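As an illustration of AF6 and AF7, the following behavioural Verilog sketch (ours, not taken from the paper; all module and port names are hypothetical) models a single memory that delivers a widened 36-bit instruction word on one port and 18-bit controller data on the other. A real implementation would simply use the block RAM's native dual-port mode with asymmetric port widths.

  module iram (
    input             clk,
    // Port A: instruction fetch, widened to 36 bits (controller word
    // plus ALU extension bits, AF7); 14-bit program counter (AF4)
    input      [13:0] pc,
    output reg [35:0] instr,
    // Port B: 18-bit controller data access in a 15-bit address space (AF6)
    input      [14:0] daddr,
    input             dwe,
    input      [17:0] dwdata,
    output reg [17:0] drdata
  );
    reg [17:0] mem [0:32767];          // 32K x 18 bits, shared by both ports

    always @(posedge clk)              // each fetch reads two adjacent words
      instr <= {mem[{pc, 1'b0}], mem[{pc, 1'b1}]};

    always @(posedge clk) begin        // synchronous data port
      if (dwe) mem[daddr] <= dwdata;
      drdata <= mem[daddr];
    end
  endmodule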

The controller provides 8 registers named I0...I7 and a single control flag used as branch condition. Table 1 lists the controller instructions. They are encoded in the lower 17 bits of the instruction word. The remaining bit, the 'x' bit, is used to control the context switches.

Table 1: The controller ISA

jp addr                   unconditional absolute jump
call addr                 subroutine call
ret, it-=offs             return from subroutine
cjf addr                  conditional absolute jump for F=1
ijnzt addr                increment and jump if negative
it=im(a)                  absolute register load
it=im(is-offs)            indirect register load
im(a)=it                  absolute register store
im(is-offs)=it            indirect register store
it=n                      load short constant
it+=n                     add short constant
it*=n                     multiply short constant
ir=it+is                  add modulo 2^18, set F=msb
ir=it-is                  subtract modulo 2^18, F=msb
ir=it&is                  bitwise 'and' operation, F=parity
ir=it*is                  multiply modulo 2^18
ir=it                     register move
ir=time etc.              internal IO instruction group
ir=alu                    input from ALU port, F=af
alu=ir                    output to ALU
F=af                      flag input from ALU
F=iordy(a)                external flag input from address a
it=io(a)                  data input to register and to F
io(a)=it                  output
dmaa=it                   DMA input control group
dt=dm(a)                  absolute load from ALU memory
dt=dm(is-offs)            indirect load, is++ for offs=0
dm(a)=dt                  absolute store to ALU memory
dm(is-offs)=dt            indirect store, is++ for offs=0
dt=dm(is-offs),dt'=dm(s)  dual indirect load
(spare codes reserved for ALU instructions)

One remarks the small number of arithmetic and logic instructions; even an add with carry is missing. The reason is that the controller is mainly involved in address computations, while all numeric processing is performed by the attached ALU. Bitwise 'or' and 'xor' operations can be derived from the '+', '-' and '&' operations, if needed. The F bit is set by a preceding arithmetic or input instruction.

The 'io' instructions don't handshake and behave like memory accesses. They save the decoding of memory-mapped IO and can be performed in parallel to memory accesses. Handshaking becomes available through the external flag input instruction. IO is further supported through two DMA input channels. Some predefined interfaces can be accessed through internal IO instructions. The 'time' register provides a time reference for real-time applications. Also, there are predefined interfaces to send to or receive from an external host computer.

The 'dm' instructions are provided to control the data memory attached to the ALU. One is a dual read instruction using the second port of the dual-ported DMEM and an extra pointer register for each thread that can be loaded with another internal IO instruction. ALU data can only be transferred via the 18-bit 'alu' port accesses, which also don't perform handshaking. Data transfers with a numeric ALU can be combined with conversion operations to and from 18-bit integers.

AF9. 18-bit integer codes converted by the ALU or to be converted into ALU data are stored in the controller memory and registers and are input and output there. ALU memory is reserved for the ALU data type. The same principle applies to the results of comparisons of ALU data. These are passed to the F bit of the controller.

In the 100 MHz implementation, the controller pipeline has 4 stages computing the next fetch address, performing the instruction fetch, decoding and selecting operands, and executing and writing back the result. During the decoding of a jump instruction, the instruction following the jump is still fetched, such that all jumps are delayed.

Each of the 4 contexts comes with its own set of registers and an 8-level return stack. Thus no time is lost to save and restore registers when a context switch occurs. A context switch propagates through all stages of the pipeline, shown here for the 4-level 100 MHz pipeline (time runs to the right, one slot per clock cycle):

Address |---K1---|---K2---|
Fetch             |---K1---|---K2---|
Decode                      |---K1---|---K2---|
Execute                               |---K1---|---K2---|

During the decode cycle for context 2 (K2), operands are selected from the registers of this context, while the write-back at the end of the execute cycle in K1 still occurs to a register in context 1.

Context switches are controlled in a non-standard way so as to never lose cycles through waiting, and to make sure that a sequence of ALU instructions can proceed at its maximum rate even if some cycles are executed by other threads. Context switches occur in three cases: under program control in response to the 'x' bit of the controller instruction, by activating thread 3, or on request from the ALU.

A context switch in response to the 'x' bit (AF8) in one of the threads 0-2 passes execution to another one of these if there is one that is ready to run. Thus, with regular patterns of 'x' instructions, the execution time can be smoothly distributed between the threads. The 'x' bit combined with an external flag input instruction is decoded to let the calling thread become inactive until the external input gets active ('1'). If this occurs to thread 3, it immediately takes over due to its higher priority. A handshaking input thus takes two instructions:

X F=iordy(a)   .. release thread, select iordy(a) to resume
it=io(b)       .. after resuming, input from IO address b
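The per-stage context selection just described can be captured in a few lines of Verilog. The sketch below is our own illustration, not code from the paper, and the names are hypothetical: the context ID travels with each instruction down the pipeline, so operands are read from the bank of the instruction being decoded while the write-back still targets the bank of the older instruction.

  module ctx_regs (
    input         clk,
    input  [1:0]  ctx_fetch,      // context of the fetched instruction
    input  [2:0]  rs, rd,         // source and destination register numbers
    input         we,             // write-back enable
    input  [17:0] result,         // execute-stage result
    output [17:0] op              // operand selected in the decode stage
  );
    // The context ID accompanies the instruction through the stages.
    reg [1:0] ctx_decode, ctx_execute, ctx_wb;
    always @(posedge clk) begin
      ctx_decode  <= ctx_fetch;
      ctx_execute <= ctx_decode;
      ctx_wb      <= ctx_execute;
    end

    // 4 contexts x 8 registers of 18 bits; a context switch is just a
    // bank-select change, so no cycles are spent saving registers.
    reg [17:0] regs [0:3][0:7];
    assign op = regs[ctx_decode][rs];
    always @(posedge clk)
      if (we) regs[ctx_wb][rd] <= result;
  endmodule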

More than one ready signal can be selected to resume, to implement alternative input. The signals waited for by some thread are continuously scanned.

AF10. Finally, the ALU can force a context switch to the thread encoded by its ACTX output by activating its FRC signal. Execution can thus be passed for just a few cycles to another thread that might start more ALU operations during this time. The generation of FRC is decoded by the ALU from its instruction bits.

The thread executing an ALU instruction with the 'x' bit set and implicitly or explicitly encoding the use of FRC, e.g. an add operation accessing some data registers like

XF d4=d6+d8 ,

is released after the start of the operation but resumed in response to FRC, typically at the earliest time at which the result of the operation can be used. For an ALU with a 4-stage execute pipeline, the executing thread would e.g. pause for 3 cycles and then continue.

The interface signals between the controller and an attached ALU are listed in table 2. They don't include an ALU instruction code, which might be decoded by the ALU from the controller instruction word or be an extra instruction word read in parallel to the controller instruction. There is an 18-bit port for transferring data between the ALU and the controller (with or without a conversion to 18-bit integer codes). There is no handshake signal, however, for such exchanges. The controller instructions accessing the ALU port are simply read and executed in parallel to the transfer operation of the ALU.

Table 2: Interface signals to an ALU

IOUT         18-bit output to ALU
ALUIN        18-bit input from ALU
AFI          flag input from ALU
CTX          actual CPU context (decode cycle)
ACTX         desired output context for ALU
ENA,FRC,INH  context control signals
ARWD,ARWE    ALU register write control
ARRA         optional 4-bit ALU register address
FETCH        signals a valid instruction fetch

[Fig. 1: Modular processor. Block diagram showing the controller with its DBI, DMA and IO ports, the IMC/IMA instruction memory interface, the DMEM, the DMA controller, and the attached ALU.]

Fig. 1 shows a full processor based on the controller and an attached ALU. Finally, the controller is linked to a host-controlled synchronous serial 5-bit bus via the DBI port. All processors implemented on the same FPGA are linked to this bus. The DBI interface implements the host interface of the controller, supports the download of programs, and provides some simple debug facilities.

The controller is complemented by an external memory controller, by interfacing components used to construct a data network within the FPGA, and by a controller of the host interface bus already mentioned. External memory data from the memory controller enter into the DMA port. The controller connects via its IO port to up to 16 interfaces. The controller manages IO for the ALU data by first transferring them from the ALU to the controller or vice versa. Direct wide data IO from the ALU can be supported through corresponding ALU instructions.

AF11. The memory controller providing the controller with access to an external memory is an interface device that receives or sends blocks of data to the processor, responding to command words from the processor. The command words identify the blocks within the external memory and typically contain the upper bits of their starting addresses. The lower address bits are generated by the memory controller. Only the memory controller is concerned with forming wide addresses for the physical memory; the controller only addresses the on-FPGA memory. The memory controller also converts between blocks of 18-bit data words from the controller and blocks of 16-bit words suitable for the external memory device.

Programs of unlimited size can be executed on the controller once software caching is used. It involves extra instructions commanding the memory controller to send new blocks of instructions, and for setting up and synchronizing DMA transfers. Even so, this technique is considered superior to spending FPGA resources on automatic cache control. Software caching can be fully pipelined with program execution and keeps a deterministic timing for the latter. The extra instructions can be synthesized at compile time and in principle be generated automatically by a compiler.

AF12. Processors only communicate with each other through dedicated networking interfaces. For small numbers of processors, directional point-to-point interfaces attached to their IO ports suffice. For larger numbers, a bus interface component is used, even allowing a processor to connect to several communication buses. Processors on the same FPGA can't directly communicate by simply connecting their 18-bit IO input and output ports, due to the uncorrelated execution of their instructions. At least a buffer register with handshaking signals is required. Moreover, the communicating processors can be far from each other on the FPGA, such that large routing delays occur. The timing of the signals must then be reestablished by introducing registers on the way, and a multi-word receive buffer large enough to support the delay of the backward handshake is needed. The resulting structure is implemented as a directional link component, using distributed RAM for the buffering.
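A directional link of this kind is essentially a small synchronous FIFO with a valid/ready handshake. The sketch below is our own minimal version (hypothetical names; it omits the extra re-timing registers that would be inserted along a long route); a buffer of this size maps onto distributed RAM.

  module link #(
    parameter W = 18,              // payload width
    parameter A = 4                // 2^A buffer words in distributed RAM
  ) (
    input          clk,
    input  [W-1:0] in_data,
    input          in_valid,
    output         in_ready,
    output [W-1:0] out_data,
    output         out_valid,
    input          out_ready
  );
    reg [W-1:0] mem [0:(1<<A)-1];  // multi-word receive buffer
    reg [A:0] wp = 0, rp = 0;      // pointers with an extra wrap bit

    wire empty = (wp == rp);
    wire full  = (wp[A] != rp[A]) && (wp[A-1:0] == rp[A-1:0]);

    assign in_ready  = !full;      // backward handshake
    assign out_valid = !empty;
    assign out_data  = mem[rp[A-1:0]];

    always @(posedge clk) begin
      if (in_valid && in_ready) begin
        mem[wp[A-1:0]] <= in_data;
        wp <= wp + 1'b1;
      end
      if (out_valid && out_ready)
        rp <= rp + 1'b1;
    end
  endmodule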

AF13. Processors can also use their DMA port to receive data from a link or bus interface. Based on this, a processor can perform remote writes into the local memory of another processor to make some of its state visible to it.

The bus interface component is not part of the CPU design but essential for the on-FPGA networking of several processors. Besides the port by which it attaches to a processor, it provides input and output ports to the bus. The bus is formed by connecting a number of bus interface components (up to 16) by their bus input and output ports into a directed ring. The bus input ports are registered, and the ring actually is a ring register in which the data words stored in the registers advance by one step in every clock cycle. In contrast to a conventional bus, the ring hence transports several data words in parallel. It is a synchronous circuit using directional signals only. The data words on the bus are 24-bit words containing 18 bits of payload data, a 4-bit destination address, and two bits required for managing the transfers. The destination address of a message is defined through a control output from the sending processor.

The V144 ALU

The first ALU combined with the controller as described above has been a floating-point ALU operating on 45-bit data words. The 35-bit mantissa requires 4 18-bit multipliers only. It uses 4-level pipelines in the 100 MHz implementation. Although the control overheads in a processor based on this ALU are a quite acceptable 30% of the total of about 2500 LUT cells, additional overheads arise for the on-FPGA networking and the sharing of memory through the controllers. The control and communications overheads can be reduced by using a smaller number of processors equipped with complex (non-scalar) ALUs. This approach is also capable of using more of the large numbers of multipliers offered on the most recent FPGA families.

There are several choices to control an ALU with multiple multipliers and adders, all of which are supported by the controller. The simplest and most common one is to use SIMD operations. The most flexible one is to use VLIW control and a bank of registers with multiple write ports. The wide instructions and the complex register structure raise the control overhead, however. We opted for an ALU providing SIMD instructions and some special non-SIMD operations, yet requiring only a single extra instruction word. While the scalar floating-point ALU consumes only 4 FPGA multipliers, the vector ALU consumes 16 of them. The number of scalar processors that fit into a Spartan-6 FPGA is bound by the number of available LUT cells and can't exploit all multipliers; the number of processors based on the vector ALU is bound by the number of available multipliers. Only the vector ALU can fully exploit the combined performance of nearly all of the available multipliers. The smaller number of vector processors also reduces the communication overheads and somewhat simplifies the programming. The most recent FPGA chips pack still more multipliers in relation to the LUT count; then a port of the present design becomes LUT-bounded again. Then an extension to still wider data words can be considered, using SIMD on top of it, similarly to what is done in recent GPU chips.

The V144 ALU uses a 144-bit data word holding a vector of 4 35-bit fixed-point components that are encoded by an unsigned 34-bit number and a sign bit. There is a spare bit for each component that allows for temporary add overflows. The word (r,x,y,z) is also written as r+ix+jy+kz. Such words can be interpreted as real 4-vectors, as complex 2-vectors, or as quaternions [5]. Sub-types are the real and complex numbers r+ix, and real 3-vectors (0,x,y,z). The quaternion product is the bilinear mapping defined on pairs of the base vectors 1,i,j,k by

1*1=1, 1*i=i*1=i, 1*j=j*1=j, 1*k=k*1=k, i*i=j*j=k*k=-1,
i*j=k, j*k=i, k*i=j, j*i=-k, k*j=-i, i*k=-j.

It extends the multiplications on the sub-types of real and complex numbers and the product of a complex number and a complex 2-vector. The conjugate of the quaternion q=r+ix+jy+kz is defined by conj(q)=r-ix-jy-kz, extending the conjugation of complex numbers. Then

q*conj(q) = r²+x²+y²+z².

If r²+x²+y²+z²=1, then the mapping v→q*v*conj(q) is a rotation of the space of real 3-vectors. Hence unit quaternions can be used to parametrize rotations. The quaternion type thus summarizes several useful real and complex data types.

The V144 ALU implements the quaternion product through 4 instructions performing the multiplication of different combinations of half-words (one being the complex multiply operation), each performing 4 real multiplies in parallel, plus some add operations. It also provides more conventional SIMD-type add operations, multiply and dot-product operations both with a double-precision result, and a radix-2 butterfly operation on complex 2-vectors. For the latter, a constant table holding the twiddle factors is included, using a read-only memory block.

The ALU uses separate 18-bit instructions that are fetched along with the controller instructions. Separate sub-fields of the ALU instruction are used to independently select ALU operations and transfer operations to the controller. Operands are selected from 16 144-bit data registers. Each context has its own bank of registers. The banks are subdivided to permit two parallel register writes. To transfer data between a data register and the controller, 8 transfer instructions are needed, each moving an 18-bit sub-field. The transfer operation of the ALU is used in parallel with an ALU port access by the controller. Data passed between the ALU and the controller are converted on-the-fly between the sign+magnitude and the two's-complement formats. The shifting hardware transferring an ALU word in multiple 18-bit packets to the ALU port of the controller is also used to access and inspect the ALU registers during single-stepping through a program.
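To see why four instructions with 4 parallel multipliers each suffice, it helps to expand the product of two quaternions q1 = r1+ix1+jy1+kz1 and q2 = r2+ix2+jy2+kz2 using the table above (this expansion is our illustration; the exact grouping of the partial products onto the four instructions is an implementation detail):

q1*q2 = (r1r2 - x1x2 - y1y2 - z1z2)
      + i(r1x2 + x1r2 + y1z2 - z1y2)
      + j(r1y2 + y1r2 + z1x2 - x1z2)
      + k(r1z2 + z1r2 + x1y2 - y1x2)

All 16 real products of the component half-words appear exactly once, so four instructions, each feeding the ALU's 4 real multipliers and followed by add operations, cover the complete product.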

Table 3a lists the ALU operations and table 3b the transfer operations.

Table 3a: Arithmetic instructions of the V144 ALU

dr,dr'=ds*dt          SIMD product, double result, frc
dr=ds+dt              SIMD sum, first shift ds
dr=dt-ds              SIMD difference, first shift ds
drl=dtl*dsl           complex product L, single res., frc
drh=dtl*dsh           complex product H, single res., frc
drh+=dth*dsl          quaternion product L, single
drl+=dth*dsh          quaternion product H, single
dr=(dt,ds)            dot product, double result, frc
dr+=(dt,ds)           accumulate dot product, frc
butterfly             uses twiddle factor from ROM
ds+dt',ds'*dt,dr=res  SIMD dot product, double, frc

Table 3b: Transfer instructions of the V144 ALU

shift dr              use shift count in control register
dshift dr             shift double to single
dr=dt                 register move
dr=conj(dt)           complex/quaternion conjugation
dr=cpu                shift from CPU
cpu=dr                rotate into CPU

Some instructions include a multi-bit shift operation. The shift count is provided in a control register and is typically used for an entire 4n-vector composed of n 4-component data words. The shift operations combine with the add/subtract operations into floating-point operations, yet using the same exponent for all components of a vector. The 35-bit fixed-point codes for the vector components can be considered as mantissas. In general, they can't be normalized. During a multi-cycle vector operation, the shift count for a subsequent normalization is established automatically. Exponents for vectors are stored in the 18-bit controller memory; they are not part of the vector codes in the data memory. The computation of shift counts is left to the controller. This simplifies the ALU but also causes some overhead for scalar floating-point operations. The use of exponents and shift counts is optional; fixed-point processing can be used instead. The SIMD product and dot-product operations accumulate the products with the full (double) precision. No further support is provided to increase the precision.

The two complex vector components used as inputs to the butterfly operation are loaded as a single data word but have a large indexing distance in the input vector to the current FFT pass. In the next pass, the butterfly input needs to combine the outputs of two different butterfly computations in the current pass. Therefore the results of both butterflies are packed accordingly before restoring them to the memory locations previously holding their vectors. The bit-reversed address sequences and the twiddle factors for the different passes are selected by means of the control register. During a pass, an input vector is fetched and an output vector stored in every cycle, using the dual memory access operation of the controller in parallel to the butterfly instruction. During each pass, the occurrence of an overflow is tracked, and the load operations in the next pass are combined with a shift operation to take care of it.

The current V144 ALU uses 16 FPGA multipliers, which are employed to close to 100% during an FFT or dot-product computation. A stream of 3D vectors can be transformed by a fixed rotation continuously, using 3 of the 4 multipliers. 10 V144-based processors running at 200 MHz fit into an XC6SLX150T FPGA and yield a peak performance of 16 GFLOP/s of extended single-precision block floating-point operations. A variant of the V144 ALU is being developed using only 12 DSP modules (3 for a 34-bit multiply). Then up to 13 processors are expected to fit into the FPGA, with a total peak performance of about 20 GFLOP/s.

System applications and tools

The described processor architecture is ready to be used on the Spartan-6 family of FPGA. The minimum configuration consists of a single controller with 1-2 memory blocks and some application-specific peripherals attached to its IO port, plus the host bus controller attached to provide external control, program download, and debug support via a UART interface or the JTAG interface of the FPGA. Larger systems are easily constructed from the processors and the interfacing building blocks. The tiny XC6SLX9 FPGA with about 5700 LUT cells can already hold a system of two scalar floating-point processors linked to each other and to a memory controller. A similar system based on a Spartan-3 FPGA has been used to provide a low-cost microprocessor lab. The lab uses assembler programming throughout; a C compiler is available, too.

The current architecture implemented on Spartan-6 is supported by the π-Nets compiler environment described in [10]. π-Nets can support targets that are networks of processors. It supports multi-threaded applications and real-time control, and can even be used to derive a suitable processor network from a multi-threaded program [2]. More programming tools are planned.

The main application of the new processor design is the design of an experimental parallel computer based on a network of XC6SLX150T chips. This system is intended to study and to develop tools and methods for FPGA-based high-performance computing (HPC) applications. The individual FPGA nodes hold small networks of V144-based processors. External memory chips attached to the FPGA nodes are accessible by all processors via memory controller interfaces as described above. The FPGA chips are linked by high-speed serial interfaces which are transparently accessed from the on-chip networks using an additional networking interface module. The current system contains 20 FPGA nodes and hence up to 200 processors (depending on the configurations being used), with a peak performance of 320 GFLOP/s. The peak performance of an individual processor can be achieved in some common applications.
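As an illustration of the radix-2 butterfly operation described above, here is a minimal Verilog sketch of one butterfly on a single complex pair, X = A + W·B and Y = A - W·B. It is our own sketch under assumed fixed-point scaling (names and the number of fraction bits are hypothetical); the real V144 datapath is pipelined and operates on packed 144-bit vectors.

  module butterfly #(
    parameter W = 18,
    parameter FRAC = 16            // assumed fraction bits of the twiddle factor
  ) (
    input                   clk,
    input  signed [W-1:0]   ar, ai,   // complex input A
    input  signed [W-1:0]   br, bi,   // complex input B
    input  signed [W-1:0]   wr, wi,   // twiddle factor W from the ROM table
    output reg signed [W:0] xr, xi,   // X = A + W*B
    output reg signed [W:0] yr, yi    // Y = A - W*B
  );
    // Complex product W*B: four real multipliers.
    wire signed [2*W-1:0] tr = br*wr - bi*wi;
    wire signed [2*W-1:0] ti = br*wi + bi*wr;
    wire signed [W-1:0]   trs = tr >>> FRAC;  // rescale back to W bits
    wire signed [W-1:0]   tis = ti >>> FRAC;

    always @(posedge clk) begin
      xr <= ar + trs;  xi <= ai + tis;
      yr <= ar - trs;  yi <= ai - tis;
    end
  endmodule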

The performance features, such as the single-cycle execution of all CPU instructions including memory accesses and the execution of ALU instructions in parallel to them, the low complexity of the CPU, the inclusion of DMA and real-time control, the capabilities of the two ALUs, and the ready-to-use interfaces and networking functions, are attractive for applications other than HPC as well, in particular for embedded control and DSP applications.

Discussion

The proposed processor architecture takes several non-standard choices which derive from the particular requirements but which may raise doubts about its usability. We discuss some possible objections to our design and justify the design decisions taken.

The first objection may regard the very small address space of the processors, which is below the 65k range of very early processors. The general trend has been to extend the logical address spaces to 32-bit addresses and beyond. A complex operating system or a GUI can hardly be realized this way, but the intended HPC applications, too, require the handling of large vectors and matrices. The actual amount of addressable memory is not that small, however. The 15-bit controller addresses are word addresses and allow for up to about 65 kBytes for the controller. The data memory is an extra address space on top of this, which could be as large as 512 kB for the V144 data type. These memory sizes are far beyond what would ever be realized within the FPGA for an individual processor. Large HPC data would have to be stored in external memory and only be cached in the DMEM.

Modern processors come with many registers, but the controller provides only 8 registers. Also, the return stacks are very small, and there seems to be no room for stack frames. To this objection one has to remark that most of the registers are located in the ALU, e.g. 16 in the V144 ALU. The controller registers mostly serve as address pointers and loop counters. The controller has 8 registers for every context. The stack used by the 'call' and 'return' instructions is dedicated to holding return addresses and can't be used to pass parameters. A stack for passing parameters and holding local variables can be realized in IMEM or DMEM using any of the 8 pointer registers. The 'return' instruction optionally deallocates stack locations.

Why do the jump instructions use absolute addresses? The 'x' bit in every instruction appears to be a waste of instruction memory. Why is the DMA limited to inputs? Absolute addresses save a bit of hardware. The input-only DMA needs a single address register only; there is no counter, as the last word of a block transfer is marked by a control signal. 'x' is not related to particular instructions and hence needs an extra field. Most choices in the instruction set and in the hardware design have alternatives and are to some degree arbitrary. Important is the decision to keep the controller hardware as simple as possible while providing maximum flexibility and efficiency of control for the attached ALU.

HPC applications usually operate on IEEE-standard single and double precision floating-point numbers. Wouldn't it be better to implement these in the ALU circuits? And why should one consider at all an exotic new processor architecture once there are ready-to-use high-performance chips? The essence of the new architecture is the functionality of the controller and its interface to an ALU to form a full processor. The controller can be combined with an ALU for one of the standard types. The chosen types really exploit the FPGA resources, while for the standard types the FPGA is much less attractive than some recent hardwired DSP chip. It depends on an analysis of the precision required for an HPC application whether the types optimized for the FPGA can be applied. Whether a processor is adopted by many users only partly depends on its technical benefits. Anyhow, the ideas in a published architecture can be taken up by other developments.

References:

[1] R. Baxter et al., Maxwell - a 64 FPGA supercomputer, www.fhpca.org
[2] F. Mayer-Lindenberg, High-level FPGA Programming through Mapping Process Networks to FPGA Resources, ReConfig09, Cancun, 2009
[3] www.xilinx.com, Spartan-6 series user manuals
[4] www.xilinx.com, EDK with µBlaze processor
[5] A. Watt, 3D Computer Graphics, Pearson, 1999
[6] Hennessy, Patterson, Computer Architecture, Morgan Kaufmann Pub. Comp.
[7] A. Ehliar, Performance driven FPGA design with an ASIC perspective, PhD thesis, Linköping, 2009
[8] S. Craven, P. Athanas, Examining the viability of FPGA supercomputing, EURASIP J. Emb. Systems, 2007
[9] C. Bassoy et al., SHARF: An FPGA based Customisable Processor Architecture, FPL'09
[10] F. Mayer-Lindenberg, Dedicated Digital Processors, Wiley Interscience, 2004
[11] www.sciengines.com, RIVYERA S6-LX150 system
[12] www.ti.com, TMS320C667x family documents
[13] www.amd.com, HD7000 GPU architecture
[14] R. Diamond, O. Mencer, W. Luk, CUSTARD - customisable threaded FPGA soft processor, FPL'05, 2006
[15] W. Chu et al., Customisable EPIC processor, Design, Automation and Test in Europe conference, 2004
[16] S. Wong, T. van As, G. Brown, r-VEX: reconfigurable and extensible softcore VLIW processor, ICFPT'08

Field Programmable Gate Arrays for Computational Acceleration of Lattice-Oriented Simulation Models

A. Gilman and K.A. Hawick
Computer Science, Institute for Information and Mathematical Sciences,
Massey University, North Shore 102-904, Auckland, New Zealand
email: {a.gilman, k.a.hawick}@massey.ac.nz
Tel: +64 9 414 0800  Fax: +64 9 441 8181
April 2012

ABSTRACT
Field Programmable Gate Arrays (FPGAs) have attracted recent attention as accelerators for a range of scientific applications which had previously been only practicable on conventional general-purpose programming platforms. We report on the performance scalability and software engineering considerations when FPGAs are used to accelerate the performance of lattice-oriented simulation models such as complex systems models. We report on the specifics of an FPGA implementation of cellular automaton models like the Game-of-Life. Typical FPGA toolkits and approaches are well suited to the localised updating of the regular data structures of models like these. We review ideas for more widespread uptake of FPGA technology in complex systems simulation applications.

KEY WORDS
FPGA-based design; simulation; complex systems; implementation development; performance evaluation.

1 Introduction

Field Programmable Gate Array technology [1-3] has developed considerably in recent years, and commodity-priced FPGA development boards now make it feasible not only to teach their use to students but also to deploy them in applied scientific research [4] as a source of high-performance compute resource. FPGAs are finding increasing uses for application development in areas including: accelerating physics calculations [5]; agent-based modelling [6]; bio-informatics data processing [7]; image processing [8-10]; as well as cellular automata simulations [11].

Engineers have been using FPGAs for a number of years in time-critical applications, such as real-time signal processing. Recent research shows real potential for high-performance computing [12], with FPGA implementations of certain problems performing better than CPU/GPU implementations [13]. This is due to the inherent flexibility in custom architecture design, allowing for better mapping of some applications to the hardware they are being executed on. However, the time and expertise required for implementation development is considerably higher than for conventional software architectural development. Even though the design entry is performed using high-level programming languages, the task is very different to writing a programme for a general-purpose computer. We found that a hardware-oriented mindset is required for efficient implementation and performance optimization.

Figure 1: ML605: Xilinx Virtex 6 development board.

In this article we explore designing FPGA-based computation engines for scientific simulations using a simple cellular automaton [14], using Conway's Game of Life [15] as an example. We used a Xilinx ML605 FPGA development board for this work. This device connects using a conventional PCIe interface slot to a PC, as is seen in Figure 1.

Our paper is structured as follows: In Section 2 we describe the general framework of use for FPGA architectures. In Section 3 we describe our use of FPGA development boards to simulate models like the Game of Life. We present some performance results in Section 4. We discuss their implications, summarise our findings and suggest some areas for further work in Section 5.

Figure 2: Simplified example of an FPGA logic slice.

2 FPGA Framework

When performing computational tasks we generally use general-purpose computers. These are designed to solve any computational task, as long as we can express it as a program. This makes them versatile and relatively easy to use to solve our computational problems. However, this same ability to perform virtually any task we may ask of them limits their computational performance for any one specific task: the general-purpose architecture is sub-optimal for any specific task and can be very far from optimal for some tasks.

As an alternative to using a general-purpose architecture, we could design a custom architecture, optimized for the specific task at hand in such a way as to exploit any spatial or temporal parallelism inherent to the problem. Implementing this custom architecture on completely custom hardware through the manufacture of an application-specific integrated circuit (ASIC) would most likely result in much better performance. Unfortunately, ASIC design and manufacture is an extremely expensive and time-consuming process. Only very few applications can qualify for this approach.

There is, however, an alternative for implementing a custom architecture: reconfigurable hardware, or more specifically field-programmable gate arrays. These pre-manufactured chips contain a sea of programmable logic with programmable interconnect fabric running in between. In addition, these devices also contain on-chip memory, optimized digital signal processing (DSP) blocks, and a multitude of input/output pins to interconnect with other hardware, like off-chip memory. The transistor count for modern FPGAs is in the billions, allowing these devices to implement some very complex designs.

FPGA chips are designed to be reconfigurable to carry out any digital logic operation. The most basic building block of these chips is a look-up table (LUT) and register pair. 4- to 6-input LUTs are common in modern devices and can implement any boolean logic function of that many inputs. A number of LUT-register pairs are grouped together into blocks called "logic slices". In addition to LUT-register pairs, logic slices also generally contain dedicated arithmetic and carry logic and a number of multiplexers that can be used to combine outputs of the slice's LUTs to implement logic functions with a higher number of inputs. A simplified example of a logic slice is shown in figure 2.

A number of logic slices are grouped together to form configurable logic blocks (CLBs). Many thousands of CLBs are located on one chip, with a complex network of interconnects occupying the fabric between them (see figure 3). Just like the logic slices within CLBs, the interconnect network is programmable and can connect inputs to outputs of various slices as needed to form more complex logic functions. In addition to logic slices, FPGAs commonly contain on-chip RAM resources called block RAM (BRAM). These blocks are generally only a few kilobytes in size, but there are many of them and they can be accessed in parallel or combined into larger memories with fairly large bandwidths.

Figure 3: FPGA layout.
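As a concrete illustration of the LUT-based logic just described, any boolean function of up to six inputs occupies one LUT's worth of logic on a 6-LUT device. This tiny Verilog module is our own example with an arbitrarily chosen function; whether it maps to exactly one LUT is ultimately up to the synthesis tool.

  // An arbitrary 6-input boolean function: one 6-input LUT's worth of logic.
  module f6 (
    input  [5:0] x,
    output       y
  );
    assign y = (x[0] & x[1]) ^ (x[2] | ~x[3]) ^ (x[4] & ~x[5]);
  endmodule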
FPGA design methodology has a number of levels of abstraction, ranging from the actual configuration of LUTs all the way up to high-level programming languages, such as Handel-C. It differs from software design in that an actual hardware device that would perform the desired function is being designed, rather than just a program to execute on an existing computer architecture. A flowchart of a typical design flow is shown in figure 4. The first step is architecture design, where the computational task at hand is analysed and decomposed into structural blocks, with each one described in terms of its functionality and interfaces.

Once the device architecture has been designed, a hardware-description language (HDL), such as VHDL or Verilog, is used to formally describe the device. This process can utilise both the behavioural and structural description of the device and its subcomponents.

Behavioural simulation is an important next step that tests the HDL description of the problem against the

desired behaviour. At this point any behavioural differences between the HDL code and its desired behaviour are ironed out. A process called synthesis is used next to infer combinatorial and sequential logic from the HDL statements. This produces an abstract representation of the desired circuit behaviour called register-transfer level (RTL) logic. This is then decomposed further into an even lower-level (logic gates and transistor-level logic) representation called a netlist, generally recorded in a vendor-neutral representation using the EDIF format.

Figure 4: FPGA Design Flow.

The actual implementation process uses a vendor-specific tool to map the netlist representation of the designed device to the actual resources available (LUTs, BRAM, DSPs, etc.) on the particular FPGA used for implementation. A process called place and route then proceeds to allocate available resources and performs routing of the signals between them as required. These are complex optimization procedures that attempt to balance multiple trade-offs, such as chip area uptake reduction, running power consumption minimization, timing performance, and implementation process runtime reduction. If one of these goals is more important than the others, for example timing performance, it is possible to use a different optimization strategy to put more emphasis on increasing the maximum clock frequency. The final step is programming file generation, which creates a bitstream file containing the configuration settings of all the FPGA resources. This file is then simply downloaded onto the FPGA to configure it.

All of the above steps are performed using electronic design automation (EDA) tools. Implementation tools need to be vendor-specific, as they relate to the particular FPGA chip that is used for final implementation. For HDL entry and synthesis, however, 3rd-party vendor-independent tools can be used (e.g. Cadence, Synplify) and the resulting netlist can be implemented using any suitable vendor. In addition to these EDA tools, designers can choose to use an even higher-level language to describe the system architecture. Recent research has been aimed at developing techniques for hardware synthesis from high-level languages like C/C++ [16, 17]. Some of these are already available on the market (e.g. Handel-C, SystemC), although their uptake is slow and they are nowhere near as popular as Verilog or VHDL.

For this project we have used a Xilinx FPGA. Xilinx is one of the two main vendors (Altera being the other one) of FPGA technology. An ML605 development board, hosting an xc6vlx240t device from the Virtex-6 family, was used. The advantage of this board is the on-board PCIe interface that can be used for exchanging data between the host PC and the FPGA. This device contains 150,720 LUTs and 301,440 flip-flops in 37,680 logic slices, and 416 36Kb block RAMs totaling 14,976 Kb of on-chip memory. We have used Xilinx ISE Design Suite 13.4 for HDL design entry, synthesis and implementation, and Xilinx ISim for behavioural simulation.

3 Simulation Formulation on FPGA

We chose to experiment with a simple Game of Life (GoL) [15] simulation for this investigation. The GoL model comprises a 2D array of bit-wise cells representing the states live or dead. At each (synchronous) time step of the model, each cell is updated according to a microscopic rule based on the number of live cells in the Moore neighbourhood of each focal cell. We can initialise the cells randomly, but that is the only randomness in the model definition, which is otherwise completely deterministic. This class of model is interesting since GoL can be generalised to a whole family of similar automata that exhibit emergent complex systems behaviour that can only be studied effectively using extensive simulations. FPGA technology is particularly well suited to carrying out parallel and effectively synchronous updates on a regular lattice model with localised neighbour communications. The model rules are readily implemented using primitives that can be easily implemented using FPGA component operations.

The architecture for our Game of Life implementation consists of two parts: the data-path and the control circuitry. The data-path is a collection of modules that store the data and operate on it. The control circuitry implements the logic required to execute the particular set of steps required to fetch the data, perform the computation, and store the result. Separating the design into two parts like this makes it easier to design and maintain complex architectures.

Figure 5: Custom 1024 × 1024 dual-port RAM.

Along with the compute logic, the data-path must contain some sort of storage to hold the state of each cell in the simulation. In software this would be stored in an array of integers located in the main memory. On the FPGA we can create a dedicated RAM to contain this data on the chip itself, using multiple blocks of BRAM. These blocks are designed to be highly configurable, allowing for a lot of flexibility, and can be connected in parallel to create RAMs with any word width (number of bits per stored element) and any depth (number of stored elements). What makes BRAM even more useful is the fact that it is dual-ported, meaning that we can read the state of the current generation and write the computed next state at the same time.

The FPGA chip that we are using contains a total of 15Mb of storage in BRAMs, which is enough to store the state of one million cells in the 1024 × 1024 simulation that we want to implement (storing one bit per cell). Memory access is a significant bottleneck of modern computers. If we are going to process more than one cell at the same time, we need to have sufficient memory bandwidth to get the required data. We can bit-pack the states of a whole column of cells into a single 1024-bit-wide memory element and create a 1024-deep RAM to store each one of the columns (see figure 5). Doing this allows us to read/write the state of 1024 cells simultaneously.

Figure 6: A single processing element contains a 3-bit shift register and combinational logic to compute the next state.

To process each cell we employ a processing element (PE), depicted in figure 6. There are 1024 instances of this element, each one computing the next-generation state of the cells located in a single row, one element at a time. To avoid having to perform nine memory accesses for each computation (as the state of nine cells is required to compute this), local buffers within each processing element are employed to store the state of three consecutive cells, centered on the index of the cell which is currently being processed. These stored states are used by the next-state computation logic and are also passed to the processing elements directly above and below. The next-state computation logic is shown in figure 7. Because the state signals of all the neighboring cells are either one for alive or zero for dead, they can simply be added together to compute the number of alive neighbors.

Figure 7: Combinational logic for next state computation.

To compute the next state for a column of cells, three separate operations need to be performed: a memory read operation, the next-state computation, and a memory write operation. Because the next-state computation requires fairly simple logic, the memory read and write operations are relatively long in comparison. It is possible to pipeline these three operations to increase the overall throughput, as shown in figure 8. If the next-state computation logic were quite complex, it could be beneficial to also pipeline it by dividing it up into smaller computation stages; however, in this case it would not achieve any gains.

We used Verilog HDL to describe the design. The Verilog description of the processing element is shown in algorithm 1. It consists of two parts: the first part is the always statement that describes a synchronous three-element shift register using three non-blocking assignment <= operators. These assignments are executed simultaneously in a single clock cycle, using the value of each variable on the right-hand side from the previous clock cycle. The value of this shift register is assigned to the reg_out port to be passed to the PE above and the PE below. The second part consists of the computation of the number of alive neighbors by summing their states, and the computation of the next state by applying the game rules.

This code (algorithm 1), along with the description of its input and output ports, forms a Verilog module. Each of the components in the architecture is described in a separate Verilog module; these modules are instantiated and interconnected using their in/out ports to complete the architecture. Modular design allows for easy reconfiguration. For example, we can change the simulation size to any size N (as long as there are enough resources on the device) simply by changing the number of instances of the PE module in the PE array and the BRAM configuration (figure 9). Or we can change the internal implementation of any module, and as long as the external interface and functionality stay the same, the design will still work.

The modules can also be parameterized, with the parameters being set at compile time. This allows for more flexibility and encourages re-usability. The processing elements, for example, can be designed to allow for q states, with q being a compile-time parameter. These can be reused in implementations of other CA, such as the Game of Death [18], that require more than two states.

Figure 8: Pipelined memory access: reading, processing and writing each column in 3 consecutive clock cycles increases overall processing throughput.

Figure 9: The processing element array contains N PEs connected in parallel to process N rows of cells simultaneously.
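A sketch of how the PE array of figure 9 might be assembled with a generate loop follows. This is our illustration, not the authors' code: the pe port list is inferred from algorithm 1 (plus an assumed nextGen output), and the edge handling, here padding with dead cells, is an assumption.

  module pe_array #(
    parameter N = 1024                 // simulation size, set at compile time
  ) (
    input          clk,
    input  [N-1:0] column_in,          // current-generation column
    output [N-1:0] next_column         // computed next-generation column
  );
    // Shift-register taps chained between vertically adjacent PEs,
    // with dead-cell padding beyond the lattice edges.
    wire [2:0] taps [0:N+1];
    assign taps[0]   = 3'b000;
    assign taps[N+1] = 3'b000;

    genvar i;
    generate
      for (i = 1; i <= N; i = i + 1) begin : pe_row
        pe pe_i (
          .CLK         (clk),
          .Data_in     (column_in[i-1]),
          .fromAbove_in(taps[i-1]),
          .fromBelow_in(taps[i+1]),
          .reg_out     (taps[i]),
          .nextGen     (next_column[i-1])
        );
      end
    endgenerate
  endmodule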

N                     512    1024   2048
Flip-flops            1550   3090   6166
LUTs                  3793   7559   15113
Logic Slices          1568   3156   6571
Max clock (MHz)       269    238    187
Compilation Time (s)  149    215    375

Table 1: Resource utilization for different simulation sizes.

4 Performance Results

Table 1 shows the resource utilization (the number of utilized flip-flops and look-up tables, and also the total number of used logic slices), the maximum clock frequency in megahertz at which each design is able to run, and also the compilation time in seconds for three simulation sizes: 512 × 512, 1024 × 1024 and 2048 × 2048. The amount of utilized resources can be seen to go up linearly with the number of instantiated processing elements (512, 1024 and 2048 respectively). The maximum clock frequency goes down with simulation size. The largest simulation size, which also has the largest number of processing elements, demonstrates the highest throughput of 386 billion individual cells processed per second, versus 244 billion cells/s for the second largest and 138 billion cells/s for the smallest. The overall performance of these three designs is 92,000 generation updates per second for the largest, 232,500 generation updates for the second largest, and 525,000 generation updates for the smallest (roughly the maximum clock frequency divided by the lattice size N, since one column is processed per clock cycle).

Figure 10: Resource utilization as a percentage of total available resources on the xc6vlx240t device.

Figure 10 shows the resource utilization figures as a percentage of the total resources available on the xc6vlx240t device. It can be seen that even for a simulation as large as 2048 × 2048 elements, logic slice utilization is under 18%.

5 Discussion & Conclusions

The development process of this design followed the design flow shown in figure 4. The overall time was roughly split half and half between HDL entry/behavioural simulation and architecture design/optimization.

Algorithm 1 Verilog code for the Processing Element.

1  always @(posedge CLK)
2  begin
3    right  <= Data_in;
4    centre <= right;
5    left   <= centre;
6  end
7  assign reg_out = {left, centre, right};
8  assign numAlive = (fromAbove_in[0] + fromAbove_in[1] + fromAbove_in[2] + left + right
9                   + fromBelow_in[0] + fromBelow_in[1] + fromBelow_in[2]);
10 assign nextGen = (numAlive == 4'd3) || ((numAlive == 4'd2) && centre);
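Algorithm 1 can be exercised in behavioural simulation (e.g. in ISim) with a small testbench. The sketch below is ours, not the paper's; it assumes the PE body above is wrapped in a module named pe with the ports shown.

  module pe_tb;
    reg        CLK = 0;
    reg        Data_in = 1;
    reg  [2:0] fromAbove_in = 3'b010;
    reg  [2:0] fromBelow_in = 3'b001;
    wire [2:0] reg_out;
    wire       nextGen;

    pe dut (
      .CLK(CLK), .Data_in(Data_in),
      .fromAbove_in(fromAbove_in), .fromBelow_in(fromBelow_in),
      .reg_out(reg_out), .nextGen(nextGen)
    );

    always #5 CLK = ~CLK;            // free-running simulation clock

    initial begin
      repeat (4) begin
        @(posedge CLK);
        $display("t=%0t shift=%b nextGen=%b", $time, reg_out, nextGen);
        Data_in = ~Data_in;          // stream a new cell state in each cycle
      end
      $finish;
    end
  endmodule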

We found both of these tasks to be highly iterative, and the use of good EDA tools can make a big difference to the amount of time/effort taken up by them.

Verilog HDL is a high-level language; however, we found writing Verilog code to be very different to writing a program in a software language, partly because the Verilog 'program' actually represents a piece of hardware with certain functionality, rather than an algorithm normally described by a program in a software language. The inherent notion of time in the process of describing synchronous logic also made it difficult to get the HDL description right the first time. This is where behavioural simulation proved to be extremely valuable.

The performance of compute engines such as the one described here can be measured using throughput, a product of how much computation is performed in one clock cycle and the clock frequency. To increase performance, we can either utilize a larger number of processing elements, or optimize the processing elements to increase the clock frequency, or both. We have tried a number of ways to simplify/optimize the processing element module and the processing element array, for example by replacing the series of 4-bit additions on lines 8 and 9 in algorithm 1 with an adder tree composed of smaller 2- and 3-bit adders and one 4-bit adder, as illustrated in algorithm 2.

Algorithm 2 Verilog code representing an adder tree.

1  // Allocating 2-bit temporary storage:
2  wire [1:0] temp1, temp2, temp3;
3  wire [2:0] temp4;  // 3-bit temporary storage
4  assign temp1 = fromAbove_i[0] + fromAbove_i[1]
5               + fromAbove_i[2];      // 2-bit adder
6  assign temp2 = fromBelow_i[0] + fromBelow_i[1]
7               + fromBelow_i[2];      // 2-bit adder
8  assign temp3 = left + right;        // 2-bit adder
9  assign temp4 = temp1 + temp2;       // 3-bit adder
10 assign numAlive = temp4 + temp3;    // 4-bit adder

These attempts resulted in a simpler RTL representation of the design. This, however, did not necessarily equate to any significant gains in maximum clock frequency or resource utilization. This is partly due to the synthesis tools running various automated logical and physical optimizations of the RTL, doing a fairly good job for a relatively simple design such as this one, and partly due to the nature of the way digital logic is implemented on FPGAs: using a series of look-up tables.

We found it difficult to manually optimize the design prior to running the synthesis, as it was hard to tell whether what we considered simpler and faster would result in a simpler and faster final implementation. This process of manual optimization turned into a series of trial-and-error steps to find out what worked and what didn't. Unfortunately, with a total compilation time of over 6 minutes for the 2048 × 2048 design, this became a very time-consuming process.

The numbers in table 1 indicate that this design scales fairly well with simulation size in terms of resource utilization. Large designs, however, cannot be clocked at the same frequency as the smaller ones, even though the main difference between them is the number of instantiated processing elements (all having the same complexity). This is likely to be due to more complex internal signal routing that introduces more delay as resource utilization increases.

Introducing more processing elements, whether for a larger simulation size or to increase the throughput even further by processing more than one column at a time, would most likely decrease the maximum clock frequency even further. Introducing more processing elements (if the memory bandwidth allows) would obviously increase throughput, but only up to a certain point, since this increase will have a detrimental effect on the maximum clock frequency.

Figure 10 shows that there is still a large portion of unused resources left on the device. This can either be utilized by more PEs to process multiple columns at the same time, as stated above, or it can be utilized to compute any required measurements, such as statistical metrics, in real time as the simulation progresses, thus saving time on further analysis. Performing the required measurements in parallel to the running of the simulation can also greatly reduce the amount of data that needs to be taken from the device, by discarding the raw data and only reading the required metrics.
This can either be uti- We found it difficult to manually optimize the design lized by more PEs to process multiple columns at the same prior to running the synthesis, as it was hard to tell whether time as stated above, or it can also be utilized to compute what we considered simpler and faster would result in sim- any required measurements, such as statistical metrics, in pler and faster final implementation. This process of man- real-time as the simulation progresses, thus saving time on ual optimization turned into a series of trial and error steps further analysis. Performing the required measurements in to find out what worked and what didn’t. Unfortunately, parallel to the running of the simulation can also greatly with a total compilation time of over 6 minutes for the reduce the amount of data that needs to be taken from the Int'l Conf. Computer Design | CDES'12 | 97

In summary, FPGA technology has a promising outlook [19] and we have found it to be well suited to this sort of complex systems simulation, which makes use of regular data structures, localised communications, and good use of component primitive operations. FPGA implementations of this class of model will potentially aid the systematic exploration of a model space by providing very fast but cheap simulation platforms.

There is additional scope to develop some of the model measurement algorithms (such as tracking the density of live cells) and mechanisms for collecting statistics that would even further enhance this platform's suitability.

References

[1] Oldfield, J.V., Dorf, R.C.: Field Programmable Gate Arrays - Reconfigurable Logic for Rapid Prototyping and Implementation of Digital Systems. ISBN 0-471-55665-3. Wiley (1995)
[2] Chu, P.P.: FPGA Prototyping by Verilog Examples. ISBN 978-0-470-18532-2. Wiley (2008)
[3] Herbordt, M.C., Gu, Y., VanCourt, T., Model, J., Sukhwani, B., Chiu, M.: Computing Models for FPGA-Based Accelerators. Computing in Science & Engineering 10(6) (November/December 2008) 35-45
[4] Stitt, G., George, A., Lam, H., Readdon, C., Smith, M., Holland, B., Aggarwal, V., Wang, G., Coole, J., Koehler, S.: An End-to-End Tool Flow for FPGA-Accelerated Scientific Computing. IEEE Design and Test of Computers (July/August 2011) 68-77
[5] Danese, G., Leporati, F., Bera, M., Giachero, M., Nazzicari, N., Spelgatti, A.: An accelerator for physics simulations. Computing in Science and Engineering 9(5) (September 2007) 16-25
[6] Chen, E., Lesau, V.G., Sabaz, D., Shannon, L., Gruver, W.A.: FPGA framework for agent systems using dynamic partial reconfiguration. In: Proc. 5th Int. Conf. on Industrial Applications of Holonic and Multi-Agent Systems (HoloMAS). Number 6867 in LNAI, Toulouse, France (29-31 August 2011) 94-102
[7] Anan'ko, A.G., Lysakov, K., Shadrin, M.Y., Lavrentiev, M.M.: Development and application of an FPGA based special processor for solving bioinformatics problems. Pattern Recognition and Image Analysis 21(3) (2011) 370-372
[8] Gribbon, K.T., Bailey, D.G., Bainbridge-Smith, A.: Development issues in using FPGAs for image processing. In: Proceedings of Image and Vision Computing New Zealand 2007, Hamilton, New Zealand (2007) 217-222
[9] Bailey, D.: Design for Embedded Image Processing on FPGAs. Wiley (2011) ISBN 9780470828496
[10] Huang, Q., Wang, Y., Chang, S.: High-performance FPGA implementation of discrete wavelet transform for image processing. In: Proc. 2011 Symposium on Photonics and Optoelectronics (SOPO), Wuhan (16-18 May 2011) 1-4
[11] Olson, A.: Towards FPGA hardware in the loop for QCA simulation. Master's thesis, Rochester Institute of Technology, New York (May 2011)
[12] El-Ghazawi, T., El-Araby, E., Huang, M., Gaj, K., Kindratenko, V., Buell, D.: The promise of high-performance reconfigurable computing. Computer 41(2) (February 2008) 69-76
[13] Che, S., Li, J., Sheaffer, J., Skadron, K., Lach, J.: Accelerating compute-intensive applications with GPUs and FPGAs. In: Application Specific Processors, 2008. SASP 2008. Symposium on (June 2008) 101-107
[14] Murtaza, S., Hoekstra, A.G., Sloot, P.M.A.: Cellular automata simulations on a FPGA cluster. High Performance Computing Applications 25(2) (October 2010) 193-204
[15] Gardner, M.: Mathematical Games: The fantastic combinations of John Conway's new solitaire game "Life". Scientific American 223 (October 1970) 120-123
[16] Edwards, S.A.: The challenges of hardware synthesis from C-like languages. In: Proceedings of the 2005 Design, Automation and Test in Europe (March 2005) 66-67
[17] Edwards, S.: The challenges of synthesizing hardware from C-like languages. Design & Test of Computers, IEEE 23(5) (May 2006) 375-386
[18] Hawick, K., Scogings, C.: Cycles, transients, and complexity in the game of death spatial automaton. In: Proc. International Conference on Scientific Computing (CSC'11). Number CSC4040, Las Vegas, USA (July 2011)
[19] Constantinides, G.A., Nicolici, N.: Surveying the landscape of FPGA accelerator research. IEEE Design and Test of Computers (July/August 2011) 6-7

Parallel Counter Formulation using a Generator Polynomial Expansion

Lee A. Belfore II Department of Electrical and Computer Engineering Old Dominion University Norfolk, VA 23323 USA

Abstract— Parallel counters have been studied for several 2. Generator Polynomial decades as a component in fast multipliers and multi- Formulation operand adder circuits. Efficient design of these functional units can produce fast & efficient multipliers and signal Before introducing the generator formulation, we first processors. Proposed in this paper is a GF(2) generator state some assumptions. First, the GF(2) algebra is used polynomial expansion that can be used to specify parallel in the formulation. Recall that GF(2) has two operators, counters. The mathematical formalism provides a general Exclusive-OR and AND, which will be denoted by + and way of describing parallel counters. Furthermore, the ex- · (often implied) respectively. Literals in the GF(2) system pansion suggests a gate level implementation for parallel nominally take on the values 0 and 1. The GF(2) system counters. serves as the mathematical system that is used in a variety of applications including, for example, encryption and error Keywords: Parallel counter, generator polynomial, synthesis. correcting codes. In this section, we explore a generator polynomial formulation for parallel counters. Rather than formulating fixed coefficient generator poly- 1. Introduction nomial, the generator polynomial coefficients described here Parallel counters are combinational logic circuits that take are determined by literals. A literal a can be represented by as input a number of equally weighted binary inputs and the following factor outputs a binary number which is the number of inputs inputs that are one. Parallel counters are used in applications which F(r) = 1 + a · r, (1) are sped up when multiple addends can be concurrently where r is the indeterminate for the generator polynomial. added to form a result. This occurs in such applications as Representing literals as in (1) suggests a mathematical fast combinational multipliers and signal processing appli- formulation for combining literal factors into higher order cations. generator polynomials. For example, the product of the Because applications vary thereby changing the number of factors for a , a , and a is operands, a generalized approach for synthesizing parallel 1 2 3 counters is also of value. Many of the approaches build (3) F (r) = (1 + a1r)(1 + a2r)(1 + a3r) general parallel counters from (3,2) parallel counters (e.g. = 1 + (a1 + a2 + a3)r full adders) as well as higher order parallel counter building 2 +(a1a2 + a1a3 + a2a3)r (2) blocks. Indeed, references [1], [2] provide good overviews 3 +(a1a2a3)r of the design and specification of parallel counters. Ap- = 1 + F (3)r + F (3)r2 + F (3)r3. plications of parallel counters in fast multipliers were first 1 2 3 formulated in [3]. Because parallel counters are used in high Note that the degree for each coefficient matches the num- performance applications, they are often designed with high ber of literals in the contributing product terms, or cubes, speed and compactness in mind to manage the complemen- included in the coefficient. Furthermore, and curiously, the tary issues of performance and implementation complexity. second and first degree coefficients, respectively, are the sum This paper is organized into six sections including an and carry functions for the full adder function for the literals introduction, a presentation of the generator polynomial a1, a2, and a3. 
Indeed, these also define the encoded outputs formulation, an analysis of power of two combinatorics as for the (3,2) parallel counter: applied to the proposed GF(2) generator polynomial, an anal- (3) ysis of the properties of generator polynomial coefficients, F2 = S1 = Carry = a1a2 + a1a3 + a2a3 (3) (3) a proposed circuit synthesis approach, and a summary. F1 = S0 = Sum = a1 + a2 + a3. Int'l Conf. Computer Design | CDES'12 | 99

Traditionally, the full adder carry function is expressed Proof: The Binomial Theorem can be used to determine the as sum of products form with AND and OR primitives. number of product terms fitting a particular composition, i.e. While in GF(2) ‘+’ is exclusive-OR, it is easy to confirm the multiplicity of x1 and x0. Consider the expression the correctness of the full adder carry function. Another n interesting observation for this simple case is that each F(r) = (1 + air). (9) degree i coefficient is the sum of all distinct cubes that are iY=1 the product of a different combination of i literals. In (9), 2n cubes result, and each can be associated with an Extending this result, a generator polynomial can be encoding, m, that reflects the contribution of the literals, a . formulated in the general case for n literals a , a , ..., a i 1 2 n Each cube is unique because it reflects of a different selection n n (n) of literals from each of the respective factors. Furthermore, F (r) = (1 + air) each cube’s degree, d, is the number of literals in the product i=1 Q (n) (n) 2 (n) n and is also the the number of ones in the encoding m. Thus, = 1 + F1 r + F2 r + ··· + Fn r (4) n the number of cubes for a particular degree follows from the (n) d = 1 + Fd r . binomial coefficient d=1 P n From (4), several useful properties can be elicited. Nd = , (10) (n) (n) d Theorem 1: Each coefficient Fd in F (r) is the n exclusive-OR of all distinct degree d cubes. where Nd is the number of degree d cubes. Each degree d d (n) Proof: From the binomial  expansion, coefficient, Fd , is the Exclusive-OR of the Nd different degree d cubes. n F(r) = (x0 + x1) , (5) which can also be expressed as 3. Power of Two Combinatorics n In §2, the first and second degree coefficients from F(r) = (x0 + x1). (6) F (3)(r) were the sum and carry functions for the full adder. iY=1 In order to generalize the formulation to parallel counters In (6), the product is composed of 2n terms where each can of arbitrary size, the number of cubes contributing to a be uniquely identified by an n bit codeword representing coefficient will be determined to study coefficient properties. the contribution of either x1 or x0 from each of the n Since the polynomial is GF(2), the focus will be determin- factors at position i. Furthermore, each individual product ing whether the number of cubes contributing to specific coefficients is even or odd. term codeword can uniquely be defined by k Lemma 2: For an integer k > 1, 2(2 −1) is largest power n k 1 x1 for mi−1 = 1 of two by which (2 )! is divisible. Sm = n (7)  x0 for mi−1 = 0 Proof: The lemma can be proven by repeatedly factoring out i k =1 k 2 two. Note that the number of even terms in (2 )! is 2 , the where k is the concatenation operator and m be the "value" 2k i number divisible by four are 4 , the number divisible by 2 of this codeword. The codeword value is determined by 2k for i ≤ k are i . Let p be the number of twos that can be associating a for x , and for x in the following 2 1 1 0 0 factored out and is expressed as expression

k k n − 2 i 1 p = i 2 for x1 (2 ) m = (8) i=1  0 for x0. Pk Xi=1 = 2k−i i=1 (11) Generalizing (6), we substitute x0 = 1 and x1 = air kP−1 resulting in the unexpanded and expanded polynomial forms = 2j j=0 given in (4). Since it has been shown that all codewords, and Pk therefore all terms, are generated by the binomial product, = 2 − 1. (4) is exhaustive and includes all unique cubes. In this Equation (11) accounts for all factors of two and the final formulation, each polynomial coefficient i will be the sum of step in (11) establishes that (2k)! is divisible by 2p = k all different cubes composed from exactly i literals, because 2(2 −1). cubes having greater or fewer literals will be included in Lemma 3: For k > j, (2k − 2j)! has exactly (2k − 2j − higher or lower degree coefficients. (k − j)) factors of two. Lemma 1: The number of degree d cubes in (4) is Nd = n 1 d . This property is generally attributed to Legendre.  100 Int'l Conf. Computer Design | CDES'12 |

Proof: ByLemma2, (2k)! is divisible by two with multiplic- 4. Properties of Generator Polynomial ity (2k)−1. Note that (2k −2j)! has 2j fewer factors and will Coefficients thus reduce the multiplicity. Noting further that (2k − 2j)! is on a 2j boundary, the multiplicity is reduced by (2j) − 1. In this section, the properties of generator polynomial The very last factor, 2k has only been reduced by 2j, and coefficients will be studied in terms of their relation to the so multiplicity must be further reduced by (k − j). We can number of literals that are one. In the previous section, express the number of two factors in (2k − 2j)! as a collection of combinatorial relations were developed to ascertain whether a particular combination specification was p = (2k − 1) − (2j − 1) − (k − j) even or odd. This is relevant to studying of coefficients 2 (12) = 2k − 2j − (k − j). because if we can relate the number of literals that are one to cubes in polynomial coefficients, coefficient values as well as their relationships. 2k k−j Lemma 4: For k > j, 2j is divisible by 2 . We will begin with the following definitions. Proof: Because  Definition 1: An instance, ι, of a degree d cube is denoted ι by Cd and is an instance ι of a product term composed of 2k (2k)! = (13) d different literals. 2j  (2 k − 2j)!(2j)! ι s Definition 2: A cube of degree d, Cd covers a cube Ce , e ≤ d when the literals that form Cs are a subset of the and using the previous Lemmas, the divisibility by powers e literals used to form to Cι . of two can be expressed as d Definition 3: The quantity |A| is the number of literals k k j in the vector A = a1a2 ··· an that are one. p2 = 2 − 1 − ((2 − 1) − (2 − 1) − (k − j))− (2j − 1) From these definitions, we can state the following two = (k − j). lemmas. ι (14) Lemma 7: If the cube C k = 1 and j < k, then each k 2 2 k−j covered cube Cζ . Thus, 2j is divisible by 2 . 2j = 1 k ι 2 +i j  Proof: If C2k = 1, then all of the literals in the cube must Lemma 5: 2j is even for k > j and i < 2 . 2k be one. Furthermore, any cube ζ formed by a subset of these  ζ Proof: For i = 0, 2j can easily be shown to be even by literals must also be one so C2j = 1. applying Lemma 4. The remaining cases can be examined k Lemma 8: The cube Cι covers 2 degree j cubes by rearranging the factors in the numerator and denominator 2k 2j 2 when the 2j literals are subset of the 2k literals used to as follows: ι form C2k . 2k+i (2k+i)! Proof: Since literals in the degree 2j are a subset of the j = k − j j 2 (2 +i 2 )!(2 )! ι k ··· k k literals formed from C , the total number of different cubes  = (2 +i) (2 +1) (2 )! 2k (2k−2j +i)···(2k−2j +1) (2 k−2j )!(2j )! is the different 2j degree cubes selected from a pool of 2k,  (2k +i)···(2k+1) 2k   2k = (2k −2j +i)···(2k−2j +1) 2j . or 2j  (15) Next,  the following theorem considers the case where k ··· k (2 +i) (2 +1) A k and the impact on the degree j, j k coefficients Because the numerator of (2k−2j +i)···(2k−2j +1) starts on a | | = 2 2 ≤ k j (n) 2 boundary and the denominator starts on a 2 boundary F2j . not divisible by 2j+1, each is divisible by the same number A k (n) Theorem 2: If | | = 2 , exactly one cube F2k evaluates of factors of two and thus is odd. By Theorem 2, the second (n) (n) to one and results in F k = 1. All other coefficients, F j = 2k 2k+i 2 2 term, 2j has exactly k − j factors of two. Thus, 2j is 0 for j < k. even for k > j and i < j. k ι  2  Proof: If |A| = 2 , then exactly one the cube, C k is one k 2 2 +i k (n) (n) Lemma 6: k is odd for i < 2 . 
2 and it is included in F2k , so therefore F2k = 1. Proof: Note that  (n) For j < k, the coefficient F2j includes the contribution n k k k 2 ι 2 2 +i (2 +i)! of 2j cubes, all covered by C2k , or 2j are one. By = k 2k (2 k)!(i)! 2 k−j k k k Lemma  4, j is divisible by 2 and thus  is even. Since  ((2 +i)···(2 +1))(2 )! 2 = (16) k n (2 k)!(i)! 2  ( ) k ··· k 2j is even, F2j = 0. (2 +i) (2 +1) k = (!i) . One interpretation of Theorem 2 is that if |A| = 2 , all ζ smaller cubes C2j that are formed from the same literals are Since ((2k +i) ··· (2k +1)) and (i)! begin on 2k boundaries, covered and effectively masked out because the total number 2k +i each is divisible by two in precisely the same way and 2k of these smaller cubes is even. Next, we generalize this result is thus odd.  for cases where |A| 6= 2k in the following two corollaries. Int'l Conf. Computer Design | CDES'12 | 101

Corollary 1: For 0 ≤ i < 2k, if |A| = (2k + i), then polynomial of the remaining literals by Corollary 3, the (n) process is repeated until all literals are covered, resulting F2k = 1. k 2k +i k in the literal count specified by (17). Proof: Since (2 +i) ones cover 2k degree 2 cubes and k k From Theorem 3 it is useful to define the binary number each of these 2 cubes is one, then the number of degree 2 (n) that gives the literal count as follows. cubes are odd by Lemma 6, and F2k = 1. (n) A k k+1 ι Definition 5: Concatenating the coefficients from F2k for Corollary 2: If | | = (2 , 2 ), one cube C2k covers 2k literals and all cubes composed of these literals. The k ∈ {km, ..., 0} results in the binary string k ι (n) (n) (n) remaining |A| − 2 literals are disjoint from C k and 2 S = F k k F k −1 k · · · k F 0 . (18) uncovered. 2 m 2 m 2 Proof: By Corollary 1, an odd number of cubes are one (n) in the coefficient F2k . Without loss of generality, we ι select one representative cube C2k and the remaining even 5. Parallel Counter Synthesis number of cubes effectively cancel one and other out. By To this point, we have developed a novel mathematical Theorem 2, all lower order cubes covered by Cι make 2k framework for describing parallel counters in the general no contribution to their respective lower order generator case. In this section, we address some basic implementation polynomial coefficients. The remaining |A| − 2k uncovered issues. Because the generator polynomial is formed with literals, Ad, are disjoint from the representative 2k literals. AND and Exclusive-OR operations, in the context of the generator polynomial coefficients, we can specify AND Corollary 3: The disjoint literals from Corollary 2, Ad, Exclusive-OR logic circuits to implement the respective form a lower order generator polynomial representative of logic functions. its constituent disjoint literals. Theorem 4: The expressions that specify S can be imple- Proof: The original general polynomial was given in (4). k mented with combinational logic circuits. Factoring out the contribution of the 2 literals that were Proof: This is shown by implementing the logic functions one leaves a generator polynomial with Ad A 2k | | = | | − defined in Theorem 3 with Exclusive-OR gates and two- factors. input AND gates. We can use Corollaries 2 and 3 as a sieve to associate Assume that we start with the generator polynomial literals that are one with power of two degree generator poly- F (j−1). We can express F (j) as nomial coefficients. At this point, we are ready to present (j) (j−1) the main result of this paper in the following theorem. F = F (1 + ajr) (j−1) (j−1) (19) Definition 4: km is the largest power of two such that = F + ajF r. k k (k −1) 2 m does not exceed n or 2 m ≤ n < 2 m . Equation (19) implies three different relations describing Theorem 3: For a given assignment of the literals the determination of the coefficients in (j) depending on (n) F a1a2 ··· an, the coefficients from F2k for k ∈ {km, ..., 0} the degree of the associated coefficient. For the first degree encode a binary number which is the number literals that coefficient, are one and is expressed as (j) (j−1) F1 = F1 + aj. (20) km Indeed, in the general case, the first degree coefficient is the A F (n) k. (17) | | = 2k 2 Exclusive-OR of all literals. For the highest degree, degree Xk=0 j case, − F (j) F (j 1)a . (21) Proof: Given 0 ≤ |A| ≤ n, where n = 2(km+1) − 1. 
In the j = j−1 j trivial case where |A| = 0, all literals are zero and therefore In this case, the highest degree coefficient is the AND of all (n) all F2k = 0. literals. In the remaining cases, the coefficients are k th Next, we will assume that |A| = 2 . In this case, the k (j) (j−1) (j−1) F = F + a F , (22) bit of the encoding is one and the rest are zero by Theorem 2 k k j k−1 (n) k < j and F2k = 1. where . Equation (22) indicates a primitive unit The general case can be shown by repeated application consisting of an AND gate and an Exclusive-OR gate can of Corollaries 2 and 3. If |A| = (2k, 2k+1), for higher be used to construct arbitrary sized parallel counters. th contributions, [k + 1, km] are zero because the cubes for the Equations (20)-(22) describe how the j degree generator respective generator polynomial coefficients must be zero. polynomial can be determined from the (j − 1)th generator (n) k By Corollary 2, F2k = 1 and it covers the included 2 lit- polynomial and literal aj. Furthermore, each additional erals in the lower order generator coefficients and contributes literal contributes a layer to a combinational logic circuit 2k to the literal count. By excluding the covered cubes to give the implementation for the next higher order polyno- by Corollary 2 and specifying a reduced order generator mial. Beginning with the first degree generator polynomial, 102 Int'l Conf. Computer Design | CDES'12 |

we can construct the circuit for the nth degree generator 6. Conclusion polynomial. The resulting parallel counter outputs are the In this paper, we present a novel formalism to describe are simply the coefficients from the generator polynomial parallel adders using a GF(2) generator polynomial expan- identified in Definition 5. sion. We have proven several useful properties that result Theorem 4 can be used to describe all the circuitry from studying the nature of the polynomial coefficients. In necessary to build the generator polynomial coefficients in particular, by selecting the coefficients that are associated the general case. Indeed, the actual circuit can be pruned by with a degree that is a power of two, the parallel counter k making a couple of observations. First, given 2 m ≤ n < encoding for the literals results. Since the GF(2) system k −1 k 2 m , only the coefficients up to 2 m need be considered. is based on AND and Exclusive-OR operators, we can In addition, coefficients not contributing to coefficients iden- further specify and synthesize parallel counter circuits of tified in S may be pruned as well. any order. Future efforts will include analysis and research The mathematical formulation just presented forms the into optimized parallel counter circuits. basis for the synthesis of parallel counters. Because the mathematics employs AND and Exclusive-OR operations for the expressions, these can be mapped directly to the imple- References mentation. Figure 1 gives a gate level schematic for a (7,3) [1] E. E. Swartzlander, Jr, “Parallel counters,” IEEE Transactions on Computers, vol. C-22, no. 11, pp. 1021–1024, November 1973. parallel counter. In addition, this figure is labeled to show [2] ——, “A review of large parallel counter designs,” in Proceedings of the the generator polynomial coefficients used to synthesize the IEEE Computer Society Annual Symposium on VLSI Emerging Trends circuit. Parts of the circuit that do not contribute to the count in VLSI Systems Design (ISVLSI’04), 2004. [3] L. Dadda, “Some schemes for parallel multipliers,” Acta Frequenza, are highlighted in gray and can be pruned from the circuit vol. 45, pp. 574–580, 1965. representing the generator polynomial.

F(7) 7

(7) (7) (7) (6) S =F F F F6 4 2 1 F(7) 6 (5) F5 (6) F5 F(7) 5 (4) F4 (5) F4 (6) F4 F(7) (3) 4 F3 (4) F3 (5) F3 (6) F3 (7) (2) F F2 (3) 3 F2 (4) F2 (5) F2 (6) F2 (7) (2) F F1 (3) 2 F1 (4) F1 (5) F1 (6) F1 F(7) 1 1 2 3 4 5 a a a a 6 a 7 a a Fig. 1 (7,3) PARALLEL COUNTER.NOTETHEPRUNEDCOMPONENTS (GRAY) AND THE GENERATOR POLYNOMIAL COEFFICIENTS. Int'l Conf. Computer Design | CDES'12 | 103

Performance Bound Energy Efficient L2 Cache Organization for Emerging Workload for Multi-Core Processor: A Comparison of Private and Shared Cache

Ramya Arun and Eugene John Department of Electrical and Computer Engineering University of Texas at San Antonio, One UTSA Circle, San Antonio, TX 78249, U. S. A.

Abstract - While multi-core design brings opportunity for is not only a performance bottleneck, but also a major factor in more performance and power efficiency, there are efficient power design. performance challenges as the number of cores increase. This

increase in performance can only be harnessed when issues like memory speed bottleneck and decreasing average cache 2 Background and Motivation size per core are overcome. The target applications must be known to maximize the performance improvements through Memory bandwidth can be increased by implementing increasing parallelism and cache hit rates. Also applications multiple main memory ports and using large number of require highly diverse cache configurations for optimal memory banks. Thus memory bandwidth bottleneck is not a energy consumption in the memory hierarchy to support their main memory problem but is an issue due to interconnect. The implementation. In this paper various cache organizations for problem lies in how the memory units get interfaced to the multi-core processors are simulated and their performance cores. and energy tradeoff are studied for emerging workloads. The main performance challenge of increasing number of Finally, the trend in performance and energy consumption for cores is the network that connects the various cores to each the optimized cache configurations with increasing number of other and the network connecting core and main memory. cores is analyzed. Multi-core systems rely on buses or rings for interconnects. Increasing the number of cores demand data at a higher rate. Keywords: L2 cache, PARSEC benchmark, Multi-core, But interconnects don't scale and hence becoming a Energy efficient cache, Workload performance optimization bottleneck. This slows overall system performance. Memory

bandwidth problem can be mitigated by distributing caches along with the cores [2]. 1 Introduction The private cache will have smaller hit access time and The challenge of every microprocessor designer is to smaller energy per access. Hits to the private partitions are improve the processor performance. The quest for better fast, while hits to neighboring partitions are slower. If the performance gives rise to various challenges. The processor threads running in different core have independent data needs speed is increasing due to advancement semiconductor then this type of partition will evolve as the best configuration. technology and due to instruction level parallelism, among Dedicated cache provides rapid access for each core as it is many other factors. These are multiplicative and increases the good for threads with strong locality and increases memory speed of processor tremendously. But the performance bandwidth. As number of cores increases the average cache improvement is not only dependent on the speed of the size per core will decrease. Number of cache blocks per core processor but also greatly dependent on the speed at which will be reduced which causes additional capacity and conflict memory components can supply instructions and data. Modern misses. Potential solutions for this problem are to increase the microprocessors get faster at an average rate of 60% per year number of cache blocks by increasing cache sizes and while system memory speed increases by only 7% per year reducing cache block size and by increasing the level of [1]. This performance gap can to a certain extend be bridged associativity. Limitations of these solutions are: larger the by good cache design. cache size larger the power dissipation and chip area. Larger In the past decade the performance of a processor was also means slower, sometimes limiting the clock rates of the synonymous with clock frequency. Thus the quest to increase cores. operating frequency has increased the power consumption of As the transistors are scales down in size architects can the processor. Power consumption increases linearly with integrate more number of cores in the same area which was frequency. Memory can consume as much as 50% of the occupied by a single core of older technologies. As the system power in a microprocessor. Thus memory organization number of cores increases the available cache space has to be 104 Int'l Conf. Computer Design | CDES'12 |

shared among the processors thereby increasing the chances of SMP. It is created by duplicating the entire processor core cache miss. This problem can be dealt by using shared cache. with almost all of its subsystems on a single die and they often Cache can be dynamically partitioned [3] and the thread that have caches sufficiently large to accommodate most of their needs more workspace can be allocated more space, which working sets. would reduce the cache miss rate. Simulators Used: Shared cache also has larger hit access time and higher (1) Multi2Sim 2.3 which is a simulation framework modeling energy per access. Flexible cache sharing can also introduce superscalar, multithreaded and multi-core processor [5]. It can complexities like requiring tremendous OS work when run programs compiled for the architecture. Memory running multi-programs concurrently, increased hardware and hierarchy with the MOESI cache coherence protocol. architecture complexities like control complexity, coherence / Interconnection network simulation with bus and P2P consistency protocol complexity, and block replacement topologies. complexity. Private and shared cache configurations have the

advantage of one as the disadvantage of the other and vice (ii) ACTI 6.5 which is integrated cache access time cycle time versa. Hence they prove to be best candidates to be evaluated area, leakage, and dynamic power model [6]. and compared. Further, cache architecture and memory technology can be selected to improve access latency. Alternately slow cache Multi-core performance fully harnessed by with high hit-rates can yield the same or better speed-up than fast cache with low hit-rates. Hit-rate is predominantly parallel workload: impacted by the application memory access pattern, the cache The full potential of multicore processors can be harnessed organization, and the cache size. This fact can be used to build when the application running on them shows parallelism. multi-level cost efficient cache hierarchies. Cache architecture Ideally n cores can yield n times the performance. However, and memory technology can be selected to improve access this only applies to applications with inherent parallelism; latency, data transfer speed and bandwidth. Even different multicore performance on a single sequential application phases of the same application may benefit from different might be worse than that of a high-powered sequential CPU. cache configurations in each phase. Recent technologies have enabled the tuning of cache parameters to the needs of an To achieve a speedup of 80 with 100 processors application. Core-based processor technologies allow a according to Amdahl’s Law the fraction of the original designer to designate a specific cache configuration. computation that can be sequential is 0.25% Additionally, processors with configurable caches are available that can have their caches configured during system (1) reset or even during runtime. Such configurable caches have 1 80  been shown to have very little size or performance overhead Fraction Parallel compared to non-configurable caches [4].  (1  Fraction Parallel ) 100 In this paper we compare the two cache configurations – Fraction Parallel  0.9975  99.75% private and shared by addressing the two microprocessor challenges mentioned earlier. The performance and power consumption characteristics for various cache configurations are simulated and analyzed. Today’s applications and workloads have ample parallelism Many applications like networking (like IP forwarding), wireless (Viterbi decode and FIR filters), security firewalls (AES), and automotive applications (engine control) 3 Methodology demonstrates parallelism properties. Emphasis on parallel Architecture simulated: programming is increasing as the performance improvement due to hardware advances are slowing down. Hence the The design of L2 cache has become very important performance and energy analysis is done with parallel work because of the large die area it occupies. There is a tradeoff load on the processors. between cache latency and hit rate. Larger caches have better Same amount of workload is allocated to each core. For hit rates but longer latency and power dissipation. Cache example, to tune a cache for quad-core processor, four threads hierarchy used here is separate L1 data and instruction cache. each comprising of same workload is created and one thread is L2 unified cache private and shared. run per core. This enables the energy consumed to be Symmetric multiprocessor (SMP) is the most common normalized per core. Also, the creation of independent process architecture used today. 
It is a system with multiple identical threads can reduce the interaction of threads on shared data processors that share main memory and controlled by a single and also will reduce the impact on hit rate due to inter OS instance. The multi-core used for analysis in this paper is processor communication. By eliminating the effect of inter Int'l Conf. Computer Design | CDES'12 | 105

processor communication, the effect of data bandwidth Performance Metrics between on-chip cache and off-chip memory with the increase in number of cores can be analyzed clearly. Average memory access time is used as the performance metric. This parameter is closely related to the actual number of cycles if the application is memory-intensive. The average memory access time is calculated according to the formula Benchmark Average memory access time = hit time + miss rate × miss PARSEC - Princeton Application Repository for Shared- penalty (2) Memory Computers has programs that focus on emerging workloads and represents next-generation shared-memory Hit time is calculated from cacti while miss rate is programs for chip-multiprocessors [7]. Bodytrack benchmark calculated from multi2sim. The miss penalty is 40 times chosen has working set no larger than 16 MB and well suited longer than the hit time [9]. for cache capacity of the latest generation of SMPs.

Energy Metrics Procedure The total energy is the sum of dynamic and static energy Optimal cache size selection is based on the fact that too given by the equation large of a cache results in cache fetches consuming Etotal = Edynamic + Estatic. (3) excessively high energy and access latency [8]. Too small of a cache could also result in wasted energy due to thrashing in Dynamic energy is dependent on the total number of the cache. By increasing the associativity for certain cache accesses and the number of cache misses and the energy applications the hit rate can be improved. In some per-access. The energy consumption for a miss includes the applications, the higher associativity would cause wastage of energy for accessing off-chip memory, energy for the energy as set-associative cache consumes higher power than microprocessor when it is stalled due to cache misses and direct-mapped cache. When dealing with many memory energy to fill the cache with a new block. The energy due to accesses with sequential addresses large caches line size are cache miss is 100 times the energy spent for a hit [9]. Static advantageous. Miss penalty, is increased since the data energy is calculated using total number of cycles and static brought into the cache is more. As the line size increases, the energy consumed per cycle. Total cache access is calculated number of blocks will decrease. using multi2Sim while the static energy is calculated from cacti. As the cache size increase the static energy per cycle Applications require highly diverse cache configurations (Estatic per cycle) increases and thus there is an increase in total for optimal energy consumption in the memory hierarchy. static energy consumed by the benchmark. Cache memories should satisfy certain performance bound requirement by the applications while the power is being Different energy-performance graphs are generated for lowered. In short size, associativity and line size of the cache different cores, cache sizes, block sizes and associativities. should reflect the working set of the application. From these graphs i.e. total energy and average memory access time (AMAT) graphs, optimal cache configuration is Various cache configurations are simulated for Uni-core, chosen for each type of core. The process of selecting the dual-core, quad- core and octa-core with private and shared optimal configuration involves the selecting the range of cache configurations respectively. Cache size, cache line size (32 size for the block size and associativity where the AAT is low. bytes, 64 bytes) and associativity (direct mapped, 2 way Then from the range of cache size the optimal cache size associative, 4 way associative and 8 way associative) are having the least energy value is selected. varied and performance and energy tradeoff are studied for workload bodytrack benchmark.Performance bound energy Energy delay metrics is given by the product of energy efficient L2 cache configuration for the workload is selected and average memory access time. These graphs are useful in by analyzing static energy consumption, dynamic energy choosing the optimal cache configuration for which the power consumption, total energy consumption and average memory consumption and delay are minimal. The cache configuration access time. From these data Energy-Delay tradeoff having the minimum energy delay metrics is the optimum characteristics is generated and analyzed. The most energy cache configuration. 
efficient cache configuration under best application performance is chosen. This is done by using the performance and energy metrics. 4 Results and Analysis For all the cores, block size 64 showed good performance characteristics and their corresponding energy consumption values scored better than the 32 byte line caches. Associativity 2 and 4 showed good energy delay metrics when compared to 106 Int'l Conf. Computer Design | CDES'12 |

the other two associativities. The cache size for which the Cache Size performance and energy are optimum are selected and tabulated in Table1 and Table 2. The scaling of cache size in private L2 cache is linear but the shared caches show a super linear increase. This signifies that the required cache area increases super linearly with increase in number of cores. Table1 Cache configuration optimized for performance and energy - Associativity 2

Associativity 2

Size (kB) AAT ns Energy (nJ) Unicore 16 1.5231 7671.3896 Dualcore Private 16 1.2094 7649.8689 DualCore Shared 32 2.2008 39427.325

QuadCore Private 32 1.669 30747.334 QuadCore Figure1: Cache size trend for private and shared L2 cache Shared 64 2.8625 80345.983 increasing number of cores Octacore Private 64 1.663 62220.956 Average Memory Access Time (AAT) Octacore In the graphs above private cache seems to perform better. Shared 128 4.0208 286938.77 The AAT seems to increase linearly for shared cache while private cache seems to have a very slight increase in AAT with increase in number of cores.

Table2 Cache configuration optimized for performance and energy - Associativity 4

Associativity 4

Size AAT (kB) (ns) Energy (nJ) Unicore 32 1.9257 8043.3855 Dualcore Private 32 1.8775 11668.27 DualCore Shared 64 2.393 50849.715

QuadCore Private 64 1.8987 23778.883 Figure2: AAT trend for private and shared L2 cache with QuadCore increasing number of cores Shared 128 2.9187 87552.418 For shared cache the access latency increases as its size Octacore keeps doubling with number of cores. Total L2 cache size for Private 128 1.8625 46865.048 the private cache configuration doubles with increase in the number of cores but the private L2 cache dedicated to each Octacore processor is of the same size (8k). L2 cache access latency for Shared 256 3.858 390748.06 the private partition remains the same as the number of cores Int'l Conf. Computer Design | CDES'12 | 107

increase. As there is less true sharing of data between the different threads most of the L2 cache accesses belong to the private cache partition Another reason for this trend is the low memory bandwidth for shared cache and as the private cache can supply more data to the processor cache misses are reduced. Miss penalty is approximately 40 times that of hit time.

Energy Consumed For shared cache energy increases exponentially with doubling the number of cores and doubling the workload. For private cache the energy increases linearly. The reason for this is the high energy consumed for a miss and hit for a larger cache when compared to a smaller cache. As the total area of private cache is divided into as many blocks as there are cores, the effective capacitance for the private cache block is less than Figure 4: Energy consumed normalized per core the total capacitance of a shared cache. Hence the energy

consumed for miss and hit for a private cache is less. Another reason for more energy spent by a shared cache is due to the Again the private cache scores better in terms of Energy increased cache misses due to the bandwidth reason as per core as the energy hit is smaller for smaller cache sizes. mentioned earlier. Energy consumed for a miss is Banked cache can be used to reduce energy consumption in approximately 100 times when compared to the energy shared cache to reduce this effect. consumed for a cache hit. Private L2 cache size is half the size of shared L2 cache for all the four types of multi-core processors i.e. 50 % savings in cache area. In the case of parallel threads with no data sharing there are more conflict misses in a shared cache than a private cache. For set associative or direct mapped block placement strategies, conflict misses occur when several blocks are mapped to the same set or block frame, also called collision misses or interference misses. This is dependent on the degree of data shared between the threads i.e. one thread can replace the data needed by another thread. Using higher associativity can minimize this effect but higher the associativity, higher the access time and counteracts the improvement provided by it. Performance bound energy efficient cache size for private cache configuration for dual-core, quad-core and octa-core are 16k, 32k and 64k respectively which translates to dedicated 8k L2 cache block per core. So the access latencies for these blocks are the same. Work load on each processor is the same and the cache miss rate should be approximately same. From the graph it can be seen that a quad-core has a little more Figure3: Energy consumption for private and shared L2 delay than the dual core. As the processes on each processor is cache with increasing number of cores independent the coherence complexities is mainly due to false sharing rather than true sharing. The energy consumed by private L2 cache for dual core is only 20% of the energy consumed by shared L2 cache. For quad-core private L2 cache consumes 38% of that consumed by shared L2 cache. Octa-core private cache consumes 22% of shared L2 cache. Average of 27% of shared cache’s total energy being consumed by a private cache. Energy saved is 73% in the case of private cache when compared to the shared cache. This ideal environment power savings percentage 108 Int'l Conf. Computer Design | CDES'12 |

shows a good scope of energy savings in the real time International Symposium on Microarchitecture, Chicago, Dec environment. 2007. [7] Christian Bienia, Sanjeev Kumar, Jaswinder Pal Singh and

Kai Li, “The PARSEC Benchmark Suite: Characterization and 5 Conclusion Architectural Implications”, Princeton University Technical Report TR-811-08, January 2008. In this paper various cache organizations are simulated for [8] Paul Sweazey and Alan Jay Smith, “A Class of Compatible multi-core processors and their performance and energy Cache Consistency Protocols and their Support by the IEEE tradeoff are studied for emerging workload. The trend in Futurebus”, 1986, IEE Transactions. performance and energy consumption for the optimized cache [9] Y Leipo, S K Lam, T.Srikanthan, Wu Jigang “Energy configurations with increasing number of cores is also Efficient Cache Tuning with Performance Bound”, analyzed. Proceedings of the Third IEEE International Workshop on Electronic Design, Test and Applications (DELTA’06). For parallel workloads with little data sharing when the [10] A Gordon-Ross, F. Vahid and N.D.Dutt, “Fast number of core is small, shared cache can be used and Configurable-Cache Tuning With a Unified Second-Level performance and energy dissipation as close as or better than Cache”, IEEE Transactions on Very Large Scale Integration private cache can be achieved by using techniques such as (VLSI) systems, vol. 17, no. 1, January 2009. dynamic cache tuning and dynamic cache partitioning. As the [11] P. Stenstrom, “The Paradigm Shift to Multi-Cores: number of cores increase the energy consumption increases Opportunities And Challenges”, Appl. Comput. Math. 6 exponentially and the performance increase linearly for a (2007), no.2, pp.253-257. shared cache system. This is mainly due to the increase in [12] William James Dally and Brian Towles “Principles and mismatch between the processor data consumption and Practices of Interconnection memory data supply rate. For a single memory multiprocessor Networks”, Morgan Kaufmann, 2004. this difference can be eliminated by using private cache and [13] Ian K. T. Tan, I Chai, P K Hoong, “Pthreads good off-chip memory to on-chip network. To make this type Performance Characteristics on Shared Cache CMP, Private of multiprocessor design work efficiently, cores should be Cache CMP and SMP”, 2010 Second International allocated the most independent processes that tend to share Conference on Computer Engineering and Applications. less data. Thus as the number of cores increase in a chip [14] Ed Suh, L Rudolph, S Devadas, “Dynamic Cache multiprocessor private L2 caches should be used instead of a Partitioning for Simultaneous Multithreading Systems”, shared cache when the workload shows ample parallelism and Proceedings of International Conference on Parallel and the threads don’t share much data. Distributed Computing and Systems 2001, August.

6 References [1] Jurgen Reinold, “Performance Implications of Next- Generation PowerPC Micorprocessor Cache Architecture”, IEEE Proceedings of COMPCON ’97. [2] A. Agarwaland M. Levy, “Going Mulitcore Presents Challenges and Opportunities”, EE Times Design, http://www.eetimes.com/design/ signal-processing- dsp/4007064/Going-multicore-presents-challenges-and- opportunities. [3] Ed Suh, L Rudolph, S Devadas, “Dynamic Cache Partitioning for Simultaneous Multithreading Systems”, Proceedings of International Conference on Parallel and Distributed Computing and Systems 2001, August. [4] Michael Zhang and Krste Asanovi´c, “Victim Migration: Dynamically Adapting Between Private and Shared CMP Caches”. Implications”, Princeton University Technical Report TR-811- 08, January 2008. [5] R. Ubal, J. Sahuquillo, S. Petit and P. L´opez, “Multi2Sim: A Simulation Framework to Evaluate Multicore-Multithreaded Processors”, Proc. of the 19th Int'l Symposium on Computer Architecture and High Performance Computing. [6] N Muralimanohar, R Balasubramonian, N P. Jouppi, “CACTI 6.0: A Tool to Model Large Caches”, Published in Int'l Conf. Computer Design | CDES'12 | 109

Simplified FPGA Design with Robei

Guosheng Wu Robei LLC, Henderson, NV, USA [email protected]

customers. A lot of software companies already transferred Abstract – Robei is a tiny cross platform FPGA design tool their products from personal computer to mobile platform. that aims to simplify design procedure, transparent However, FPGA design on mobile platform is still a challenge intellectual properties and reduce complexity. It makes FPGA due to the high complexity and huge size of FPGA design design like playing with boxes by breaking down hardware tools. into three basic elements: module, port and wire. Through these elements, engineer can implement either top-down or bottom-up design. Standard Verilog code can be integrated with other EDA tools generated from design diagram. Robei also runs on embedded platforms, which makes it distinctive from other EDA design software.

Keywords: Robei; FPGA; Verilog; EDA; Simulation;

1 Introduction FPGA is a technology intensive field full of innovations. Because of its low cost, easy to change and short time to market advantages, plenty of companies and individual engineers choose it to prototype products. Hardware engineers are focusing on developing their new designs in programming languages like VHDL and Verilog [1].

Although the future is very bright, there are still some Fig.1 Robei user interface obstacles that prevent FPGA market from fast growing, like high background requirement, opaque intellectual properties, This paper proposes a cross platform FPGA design and huge, complexity design tools. First of all, FPGA design software named Robei, shows in Fig.1. It is based on the most requires not only knowledge of physics and circuits, but also popular cross platform GUI framework, QT [7] which is digital design and logic synthesis. However, most of such already ported to iOS and Android platform. Robei can be re- knowledge is offered in universities and graduate schools, compiled on many platforms without much modification. It which limits the market growing. Second, FPGA tools are aims to simplify user interface for FPGA design, transparent unique to each other. In order to implement a project on intellectual properties and reduce design complexity. The certain FPGA chips, the designer must stick to FPGA vendor's modern user interface of it combines diagram design method software and intellectual properties. Design engineers need to to represent circuit connections and coding method for spend a lot of time to get familiar with these tools for the first algorithm inputs. Robei is designed to be as simple as possible time. Third, FPGA tools are huge in size and complex in for engineers. Let them mange in 15 minutes as long as they contents. Current size of FPGA design software already are familiar with Verilog language. Any pre-designed and counts in gigabytes. Bugs are increasing as the software size system provided models are transparent to users. Property increase, which lead to the phenomenon that most engineers editor offers the most convenient method for viewing and choose to use older version of design tools. Because there are modifying properties for each element. There are only three so many options and features in design tools, which already elements to representing the circuits: module, port and wire. beyond engineers' play-to-learn ability, so FPGA vendors By simply playing with these elements, FPGA designers can need to spend a lot of time and money on training customers. construct project by either bottom-up or top-down mechanism easily. On the other hand, the successful stories of Apple's iOS and Google's Android [6] on mobile platforms proved that The remainder of this paper is organized as follows: user interface become more and more important for element description is illustrated in Section 2; Code 110 Int'l Conf. Computer Design | CDES'12 |

generation is presented in Section 3 and a simple FIR filter 2.2 Port design example is depicted in Section 4; screen captures of Robei running on Android platform is shown in section 5; A port may correspond to a pin on a chip, an edge conclusions and future work are given in Section 6. connector on a board, or any logical channel of communication with a block of hardware. The detail properties are listed in fig.3. The type of port varies a lot as 2 Robei Elements Robei supports many types in Verilog, like reg, wire, tri, Robei employs three elements to represent Verilog supply, etc. There is "Datasize" option for port, which components in hardware design: module, port and wire. In specifies the size of port if it is a bus. Some interesting Verilog, circuits are represented by a set of "modules". A features worth to mention are port can only slide on edges of module may be only a gate, a flip-flop, a register, but also can module, and when module moves, port keeps sticking to edges be an ALU, a controller or a SOC system. We can consider a all the time. module as an abstract chip, which have different ports (pins) to communicate with other chips. A finished design module can be considered as model, which locates in "Toolbox" area and can be reused. Port is the interface channel for each module and model. Wire is used to connect ports on different module or model for signal transmission.

2.1 Module In Robei, a module, the basic in a design, can be considered as a black box. Inside this box, designer can place ports algorithm codes and models. Each module can have zero or more ports for communication with other modules or models. The code view tab allows engineers to add or modify Fig. 3 Port properties algorithm code easily to realize certain behaviors. 2.3 Wire

Fig. 2 module and model: only the biggest rectangle

(coder_example) is module, the other smaller rectangles and Fig. 4 Wire rectangles in "ToolBox" are all models, which are pre- designed modules. Wire connects two ports and responses for signal transmission. Most of time, wire will inherent the color and Based on usage status, a module can have different data size from the first connected ports (as shown in fig.4). types. Currently under developing one has the type of Based on different data size, wire has different thickness. "module", but once it is used in other modules, the type will Robei helps to check whether two connected ports have same automatically change to "model" as only certain properties data size or not when simulation start. can be modified in order to keep consistency with previous design. Fig.2 illustrates "module" and "model". There is Module, port and wire are basic elements and used very another special type named "testbench", which is the top level often. Robei break down FPGA structure level design to these design with stimulate code for simulation. simple elements so even high school student can play with it.

Int'l Conf. Computer Design | CDES'12 | 111

3 Code Generation 4 FIR Filter Example The simplified user interface helps designer to reduce the Finite Impulse Response (FIR), which is weighted code input and avoid mistakes. Instead of typing complete summations of input sequences, has been widely used in code for a project, hardware designers just need to write their digital signal processing because of its advantages of linear core algorithm with code editor, while the interface design can phase, inherently stability, less precision errors and efficient be completed by playing with elements. For example, fig.5 implementation [3, 4]. Equation 1 is mathematical and fig.6 show a simple counter design and its core algorithm representation of FIR filter. that requires user to type in. The code under "Code start here" are core algorithm part. The other part like parameters, port L declaration, module instantiation and extra signal declaration y[n]  h[k] x[n  k] (1) are generated based on designed property by Robei. k0 Where x[n] is input data at discrete time n , L is length of FIR filter, h[k] is filter coefficient and y[n] is the output at discrete time .

In this example, choose L  3, the other coefficients as h[0] 1/ 4,h[1] 1/ 2 and h[2] 1/ 4 to avoid floating point calculation. So the output sequence would be: 1 1 1 y[n]  x[n] x[n 1] x[n  2] 4 2 4 1  (x[n] 2x[n 1] x[n  2]) 4 Fig. 5 Simple counter example (2)  f (n)  2 Where f [n]  x[n] 2x[n 1] x[n  2]. In order to simulate the result accurately, instead of analyzing response to , this paper analyzes the response of f [n] to input data sequence. Design structure x[n]  f [n] shows in fig. 7.

Fig.6 core algorithm that requires user to type in (on left), and generated code by Robei (right). Robei can generate code based the design diagram, while integrating with algorithm code from engineer. Fig. 7 FIR filter design example: there are two D flip flops to

perform delay, and three multipliers with parameter of 1, 2, 1 Unlike Matlab toolbox [2], which encapsulates correspondingly. The "adder1" performs summation of three everything into its own model that makes custom model inputs. design complex, Robei displays every property (shows in

property editor) to users. Users have the control of their own In order to verify the design, unit step response [5] is design. At the same time, model is transparent to users as they can view inside at any time by double clicking on it. employed as stimulate for test bench. When n  0 , f [0]  h[0], and f [1]  h[0] h[1] if n 1. The

112 Int'l Conf. Computer Design | CDES'12 |

relation between f [n] and coefficients shows in equation 3 when n  L 1. L L f [n]  h[k]u[n] h[k] (3) k0 k0

The output signal value proves statement above in simulation result (fig.8).

5 Embedded Platform

Thanks to Necessitas, the code name for the port of Qt to the Android operating system, and to the Verilog Behavioral Simulator (VBS), the first cross-platform Verilog simulator, written by Jimen Ching, Robei is the first FPGA design tool that can perform simulation on embedded platforms such as Android. Figure 9 shows screenshots captured while running on the Android 2.3 platform. With Robei, developers can work on their designs anywhere, on mobile phones or tablets.

Fig. 9 Robei on Android: (a) user interface; (b) waveform simulation

6 Conclusions

Robei offers a new approach to FPGA design by providing a simplified hardware design procedure, transparent properties, and high mobility. The goal of this design tool is to let everyone experiment with FPGA design anywhere, and to draw innovation from new sources beyond hardware engineers.

The design concept of Robei is not only suitable for FPGA design but also feasible in other fields. In the future, it will be improved and extended to other research and industrial fields to provide more practical value.

7 References

[1] Chinedu O.K., Genevera E.C., Akinyele O.O., "Hardware description language (HDL): An efficient approach to device independent designs for VLSI market segments," IEEE International Conf. on Adaptive Sci. & Tech. (ICAST), Abuja, Nigeria, pp. 262–267, Nov. 2011.
[2] Vila-Rosado D.N. and Dominguez-Lopez J.A., "A Matlab toolbox for robotic manipulators," Mexican International Conference on Computer Science, Guanajuato, Mexico, pp. 256–263, Sep. 2005.
[3] Chandra A., Chattopadhyay S., Sanyal S.K., "An efficient algorithm to minimize the number of coefficients of an FIR pulse-shaping filter," IEEE India Conf., Kolkata, India, pp. 1–4, Dec. 2010.
[4] Jongsun Park, Woopyo Jeong, Mahmoodi-Meimand H., Yongtao Wang, Choo H., Roy K., "Computation sharing programmable FIR filter for low-power and high-performance applications," IEEE Journal of Solid-State Circuits, Vol. 39, pp. 348–357, Feb. 2004.
[5] Yutthagowith P., Pattanadech N., Kunakorn A., Phoomvuthisarn S., "Computer Aided-Program for Validation of Measuring System from Unit Step Response by Time Convolution Method," IEEE TENCON 2005, Region 10, pp. 1–6, Nov. 2005.
[6] Paul K., Kundu T.K., "Android on Mobile Devices: An Energy Perspective," IEEE Intern. Conf. on Computer & Inf. Tech. (CIT), Bradford, West Yorkshire, UK, pp. 2421–2426, Jul. 2010.
[7] Lobur M., Dykhta I., Golovatsky R., Wrobel J., "The usage of signals and slots mechanism for custom software development in case of incomplete information," Intern. Conf. on the Experience of Designing & Application of CAD Systems in Microelectronics (CADSM), Polyana-Svalyava, Ukraine, pp. 226–227, Feb. 2011.

SD-MARC: A New Multi-Processor Architecture

A. Somdip Dey Department of Computer Science, St. Xavier’s College [Autonomous] Kolkata, India.

Abstract - In the modern day, HPC (High Performance Computing) is applied to massive scientific and other large computational work, but HPC architecture mostly consists of computer clusters (a series of computers connected to each other to solve a single problem / task). It is therefore problematic for many institutions or organizations to maintain a computer cluster for doing huge computations. With the advancement of technology and efficient multiprocessor architecture, it is possible to integrate thousands of processors in one computer system and apply that system to such computations. This paper provides the idea of a new model for applying multiple processors (on the order of n2 (n x n) processors) in one computer system architecture, and shows how to implement the model to compute different problems faster than is otherwise possible; provided: n >= 4 and n = 2m (n even), where m = 2, 3, 4, ……, N.

Keywords: Multiprocessor; High Performance Computing; Architecture; Matrix; Computer model

1 Introduction

With the rising demand for computational power, new architectures in multiprocessor systems and technological advancements in multicore computers have appeared. This gave rise to High Performance Computing and parallel-processor computer systems. Nowadays huge processing power is required to compute and process huge amounts of data, so there is an urgent need for advanced multiprocessor computer architectures that can process these data within a short span of time.

The objective of this paper is to provide an overview of a new architecture for multiprocessor systems, to show how this architecture, SD-MARC, is beneficial in terms of computation, and to show how to implement it.

2 The Architecture

This section covers the details of the architecture of SD-MARC. Before moving on to SD-MARC, let us first review the various techniques and architectures used in multiprocessing systems. Multiprocessor architecture can be broadly classified into:

1. Symmetric Multiprocessing (SMP)
2. Asymmetric Multiprocessing (AMP)
3. NUMA (Non-Uniform Memory Access Multiprocessing)
4. Cluster Multiprocessing (CM) [the concept of Distributed Computing]

SMP (Symmetric Multiprocessing):

In a multiprocessing system, when all CPUs / processors are treated equally, the system is called a symmetric multiprocessing (SMP) system. In SMP, two or more identical processors are connected to a single shared main memory (computer memory) and are managed by a single operating system instance; i.e., a monolithic-kernel type of OS is used for this type of system to utilize the resources. Nowadays most multiprocessor architectures use SMP. In SMP the processors are connected to each other by buses, crossbar switches, or on-chip mesh networks.

The advantages of SMP include a large global memory and better performance per unit of power consumed by the system. SMP also provides simple node-to-node (processor-to-processor) communication. The main disadvantages of SMP include the fact that the memory latency and bandwidth of a given node can be affected by other nodes, and cache "thrashing" may occur in some applications.

AMP (Asymmetric Multiprocessing):

AMP (Asymmetric Multiprocessing) designs use SMP hardware architecture, where a common global memory is shared between the various processors. In AMP designs, application tasks are sent to the system's separate processors. These processors may all be located on different boards or collocated on the same board, but each is essentially a separate computing system with its own OS and memory partition within the common global memory. One advantage of an AMP design is that asymmetric memory partitions can be assigned from one large global memory, making more efficient use of memory resources and potentially reducing system cost. SMP architectures differ from AMP in that a single block of memory is shared by the multiple processors or by multiple cores on a single multi-core processor, and a single OS (Operating System) image runs across all the cores, enabling truly parallel processing.

NUMA (Non-Uniform Memory Access Multiprocessing):

In NUMA (Non-Uniform Memory Access) multiprocessing, the memory access time depends on the memory location relative to a processor. In recent times processors have become faster than the memories they use, so there is a big gap between the speed of a processor and that of a memory. To get high performance from a multiprocessor architecture, one therefore has to install high-speed cache memory and use advanced algorithms to reduce cache misses. NUMA tries to solve this problem by providing separate memory for each processor, avoiding the performance hit that occurs when several processors attempt to address the same memory.

CM (Clustered Multiprocessing):

In clustered multiprocessing, many computers are loosely connected to each other, forming a computer cluster, and they work together so that in many respects they can be viewed as a single system. Usually the computers in a cluster are connected via high-speed local area networks. This concept evolved from distributed computing, where different computers are connected to carry out one common task or achieve one common goal. In CM architecture we see the use of a master node and compute nodes, where the master node controls and distributes work (processes) to the compute nodes.

In SD-MARC, we combine the logic and ideas of most of these multiprocessor architectures, so that a new, powerful architecture can be made out of them and implemented to compute faster while keeping the cost of implementation relatively low.

2.1 SD-MARC:

A. The Basic Design of SD-MARC

The basic design of SD-MARC consists of n2 ('n x n') processors. In this architecture the arrangement of the processors can be thought of as a square matrix of 'n x n', in which the number of columns and the number of rows are both equal to 'n', with n >= 4 and n an even number, i.e. n = 2m, where m = 2, 3, 4, 5, ……

So the system consists of 'n x n' processors, as shown in Fig 1.1, where the processors or CPUs (Central Processing Units) are arranged in a '4 x 4' formation, for a total of 16 processors / CPUs.

N.B.: In Fig 1.1 the number of processors in each row and each column is 4, so they form a 4 x 4 multiprocessor system. From Fig 1.1 we can see that 16 processors are arranged in a 4 x 4 matrix formation; in the same way, 'n x n' processors can be arranged in an 'n x n' square matrix formation, where each row and each column contains 'n' processors.

This 'n x n' processor formation can be further divided into four divisions: Division 1, 2, 3, 4, where each division has ((n x n) / 4) processors. For example, if a system consists of 16 (4 x 4) processors, then each division has 4 processors; in a system of 36 (6 x 6) processors, there are 9 processors in each division. This division system is illustrated in Fig 1.2, which shows the division system of the processors in a 4 x 4 processor system.

Along with this division system we use the concept of computer clusters. In computer clusters there is the concept of a master node and compute nodes, where the master node distributes the workload to the different compute nodes. In the same way, in the division system of this multiprocessor architecture, the concept of 'Master Processors' is implemented. Each division of processors has a master processor, which distributes the workload of that division to the other processors of that division, called Compute Processors. So there are 4 master processors in this multiprocessor architecture. Each division has its processors numbered in order, e.g. CPU 1, CPU 2, ………, and out of these processors one is the master processor of that division, denoted M CPU, as shown in Fig 1.3. So in the first division, i.e. Division 1, there will be CPU 1 to CPU j, where 'j' is the last processor
number in that division; the starting processor of the next division will be CPU j+1, which goes up to CPU k, where 'k' is the last processor number in Division 2, and so on.

In this system the 4 master processors are placed side by side, so that they can communicate with each other easily and very fast. The 4 master processors of the 4 divisions communicate with each other from time to time and synchronize among themselves the works / processes they are coordinating. This concept is also illustrated in Fig 1.3.

B. Processor Architecture and Relation between each Processor

Each processor has high-speed register memories of its own along with an L1 (Level 1) cache memory. Two adjacent processors / CPUs share an L2 (Level 2) cache memory, and four adjacent processors share an L3 (Level 3) cache memory. Fig 2.1 visualizes this concept, showing the relation between processors / CPUs in terms of memory.

In the modern world, main computer memory can be classified in three basic ways:
a. Distributed Memory
b. Shared Memory
c. Distributed Shared Memory

Mostly the Distributed Memory and Distributed Shared Memory concepts are used in computer architecture, where each CPU has a private memory block of its own. But in this architecture a concept of 'Hybrid Memory Distribution' is used. The 'Hybrid Memory Distribution' concept is: a memory block is not private to one CPU; instead, each memory block is shared by four CPUs / processors, as seen in Fig 2.1.

Since four CPUs share one memory, there is a possibility of resource holding, which may give rise to a deadlock situation. To avoid this, each memory block has addresses of its own, and those addresses are never the same as the addresses in any other memory block. For example, if in Memory 1 the starting address is 1000 (hex) and the end address is F000 (hex), then the starting address in Memory 2 can be F001 (hex), and no address in Memory 2 will ever be the same as an address in Memory 1.

From Fig 2.1 we can see that one memory is shared by four CPUs, and these four CPUs form a 'Block'. In Fig 2.1 we can see two blocks of CPUs, Block 1 and Block 2. Basically, in this architecture the whole multiprocessor system is divided into a 'matrix' system, each matrix system is divided into four 'Divisions', and finally each division has several Blocks. Each block has a memory of its own, which is not shared with other blocks or with CPUs of other blocks, and the addresses in its memory never coincide with the addresses of the memories of other blocks.

If a multiprocessor matrix system is such that ((n x n) / 4) = ans, where ans is not perfectly divisible by 4, then that system must have a single CPU / processor left out of the block formation. For example, if there are 36 processors in a system, then it is a '6 x 6' CPU system. If we form four divisions, there will be 9 CPUs in each division. And if we try to form blocks out of each division, we can see that only two blocks can be made in each division, and one CPU will be left out of the blocks in that division. In that case we consider the leftover CPU to be the Master CPU and attach it to the last block formed in that division. This concept is illustrated in Fig 2.2.

N.B.: In Fig 2.2 the 'BUS' is the bus system through which every interaction between the CPUs and the memory system takes place. Since a '6 x 6' multiprocessor system is taken as the example, from Fig 2.2 we can see that there are two blocks in each division, so there are 8 computer memories in the architecture.

C. Interaction Between the Processors

Fig 3.1 shows the interaction and communication between the processors. In this multiprocessor architecture, all the Compute processors in a division are connected to the Master processor. The Master processor of that division is in turn connected to the other Master processors adjacent to it. The Master processor of a division controls and distributes workloads to the Compute processors, and the Master processors then communicate with each other to synchronize the tasks they have performed. When a Master processor is assigned a work (process) to perform, it first breaks that work into several small processes (which can also be called 'threads') and assigns these small processes to different Compute processors of its division, along with instructions for fetching the memory addresses needed to perform them. The Master processor keeps track of each task (thread) assigned to each Compute processor and of the memory addresses accessed by those Compute processors. After the completion of each task (thread), each Compute processor synchronizes its tasks (threads) with the Master processor and frees the memory addresses used for those threads, which are again tracked by the Master processor. That Master processor then synchronizes the completed process with the other Master processors of the other divisions.

D. Operating System for this Architecture

To utilize the computational capacities of the hardware of a computer, an operating system is essential. A kernel is the central component of an operating system; it acts as an interface between the user applications and the hardware. In most computers in the world, either a micro-kernel or a monolithic-kernel type of operating system is used. A monolithic kernel can perform all the operations and consists of only one layer; e.g., in PCs (personal computers) a monolithic kernel is used to perform computation. A micro kernel can perform mainly a few operations along with one global operation; i.e., it can perform one global process and other small low-level processes. Micro kernels are mainly used in distributed systems or multiprocessor systems.

Since in this architecture multiple processors are integrated into one single computer, using either of the two operating system types alone would not be feasible. To solve this problem, a new operating system concept is introduced along with this architecture. The system will have a basic monolithic kernel, and this monolithic kernel will in turn have many small kernels (just like micro kernels) built into it. These micro kernels will perform small tasks along with a global task and will be assigned to each Master processor. Since a micro kernel can perform only one global process at a time, the Master processor can utilize this fact and perform that one global process at a time along with other small processes. This type of kernel is called a 'Hybrid Kernel'.
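To make the division and block rules of Sections 2.1.A–B concrete, the following sketch computes, for a given even n, the division size, the number of complete four-CPU blocks per division, and whether a leftover CPU must be attached to the last block. This is an illustration written for this summary, not code from the paper; the variable names are hypothetical.

#include <iostream>

int main() {
    int n = 6;                               // even matrix dimension, n >= 4 (hypothetical input)
    int total = n * n;                       // processors in the n x n matrix
    int perDivision = total / 4;             // processors in each of the 4 divisions
    int blocksPerDivision = perDivision / 4; // complete 4-CPU blocks per division
    int leftover = perDivision % 4;          // CPUs left out of the block formation

    std::cout << total << " CPUs, " << perDivision << " per division, "
              << blocksPerDivision << " block(s) per division\n";
    if (leftover == 1)
        std::cout << "One CPU per division is left over: it becomes the M CPU "
                     "and is attached to the division's last block\n";
    // For n = 6: 36 CPUs, 9 per division, 2 blocks per division and one
    // leftover CPU, matching the paper's 6 x 6 example (Fig 2.2).
}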

3 List of Figures:

Fig 1.1: 16 processors arranged in a 4 x 4 matrix pattern

Fig 1.2: Division system in a 4 x 4 processor system

Fig 1.3: Division system with the concept of Master Processors and Compute Processors in a (6 x 6) processor system

Fig 2.1: Relation between CPUs in terms of memory (N.B.: R1 – R8 are the register memories of CPU 1 – CPU 8)

Fig 2.2: Attachment of the M CPU (Master CPU) to the last block in a division (N.B.: R1 – R8 are the register memories of CPU 1 – CPU 8; R9 is the register memory of the M CPU)

Fig 3.1: The interaction between CPUs / processors in a '4 x 4' processor system


4 Few More Details

A. Cooling of the System

Since multiple processors are integrated within a single computer, the system will obviously get very hot and generate a lot of heat. To deal with this problem, a high-grade cooling system needs to be installed in the system. Both air cooling and liquid cooling can be used to keep the computer cool and ventilate the hot air out.

B. Need for Multiple Processors in one System

In HPC (High Performance Computing) and in supercomputers, the concept of computer clusters is mainly used to compute a task, with many computers connected to each other via a network. But there is an ever-growing demand for fast computers in today's world. If computer clusters are used, the system becomes huge to maintain and is also economically very costly. If instead multiple processors (on the order of hundreds or thousands) are integrated into one computer using nanotechnology or other recent technological advances, it becomes easier to manufacture smaller supercomputers, easier to maintain them, and easier to compute and finish a task faster.

C. Use of a Single Computer Memory for 4 CPUs

In this architecture four CPUs (processors) share a single computer memory (RAM, or Random Access Memory, in general). If the Distributed Memory or Distributed Shared Memory concept were used, the number of computer memories would equal the number of processors, which in this architecture is of the order 'n x n' (n2), a very large number, and that would make the system quite large to maintain. To deal with this problem, four CPUs share a single computer memory, and for a system with 'n x n' processors the number of computer memories needed in this architecture is ( abs[ ((n x n) / 4) / 4 ] x 4 ), where abs() denotes the integer part of a number, for example abs(2.25) = 2.
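Reading the counting rule back: ((n x n) / 4) / 4 is the number of complete blocks per division, and multiplying the truncated result by 4 gives one memory per block across the four divisions. A quick check of the rule as reconstructed above (a hypothetical helper, not from the paper):

#include <iostream>

// Memories needed for an n x n SD-MARC system under the counting rule above:
// four CPUs per memory block, four divisions, and any leftover CPU attached
// to an existing block instead of getting a memory of its own.
int memoriesNeeded(int n) {
    int perDivision = (n * n) / 4;            // CPUs in each division
    int blocksPerDivision = perDivision / 4;  // abs[] = integer part
    return blocksPerDivision * 4;             // one memory per block, 4 divisions
}

int main() {
    std::cout << memoriesNeeded(4) << '\n';  // 4 x 4 system -> 4 memories
    std::cout << memoriesNeeded(6) << '\n';  // 6 x 6 system -> 8 memories (Fig 2.2)
}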

D. Architecture Usage and Future Scope

Since SD-MARC uses the advantages of the different types of multiprocessor architecture, it is beneficial to use and implement, but the architecture still has a drawback: no operating system exists to date that supports it. Also, if this architecture is used in small systems, or if very few processors are used to build a multiprocessor system with it, performance may be lacking compared with a system with thousands of processors.

5 Conclusion

Since this multiprocessor architecture, SD-MARC, is housed in a single computer system, it can be used in any field of application where intense computational power is needed in one single system. For example, hospitals could use this architecture to compute different problems and the chemical structures of medicines, and to solve and diagnose different diseases in very little time. Even computer enthusiasts could use this architecture to build their own personal supercomputers and use them to compute time-consuming problems in far less time.

6 References

[1] W. Anderson, F. J. Sparacio, and R. M. Tomasulo, "The IBM System/360 Model 91: Machine Philosophy and Instruction-Handling," IBM Journal of Research and Development, Vol. 11, No. 1, pp. 8–24, January 1967.
[2] Proceedings, Supercomputing '88 (IEEE Cat. No. 88CH2617-9), November 1988.
[3] Norman P. Jouppi, "The Nonuniform Distribution of Instruction-Level and Machine Parallelism and Its Effect on Performance," IEEE Transactions on Computers, Vol. 38, No. 12, pp. 1645–1658, December 1989.
[4] Youngjin Kwon, Changdae Kim, Seungryoul Maeng, Jaehyuk Huh, "Virtualizing performance asymmetric multi-core systems," International Symposium on Computer Architecture, pp. 45–56.
[5] Rajkumar Buyya (editor), High Performance Cluster Computing: Architectures and Systems, Volume 1, ISBN 0-13-013784-7, and Volume 2, ISBN 0-13-013785-5, Prentice Hall, NJ, USA, 1999.
[6] Internet source: http://www.intel.com/pressroom/archive/reference/whitepaper_QuickPath.pdf [last visited 10/03/2012]
[7] Internet source: http://users.ece.utexas.edu/~bevans/papers/2009/multicore/MulticoreDSPsForIEEESPMFinal.pdf [last visited 11/03/2012]


A New Technique To Use A Parallel Compiler for Multi-core Microcontrollers

A. Somdip Dey Department of Computer Science,

St. Xavier’s College [Autonomous]

Kolkata, India.

Abstract - Nowadays multi-core microcontrollers are being used in various fields. Due to the resource limitations of microcontrollers, programming them is difficult. This work presents a simple parallel compiler that can exploit multiple cores to speed up parallel tasks on a multi-core microcontroller. The parallel constructors are introduced. A scheme that uses compiler directives to hint the compiler is discussed. Experiments on the real processor are performed to validate the scheme. The results show that the compiler can exploit multiple cores to speed up computation tasks on the target microcontroller.

Keywords: multi-core microcontroller; parallel compiler; processing; multi-tasking

1 Introduction

Microcontrollers are very interesting and fascinating due to their vast range of applications nowadays. They are used in many embedded systems, from a tiny robot to a car and even an enormous airplane. New development of microcontrollers occurs at a very fast pace, and multi-core microcontrollers have now become available. However, the programming model for such platforms is still rare, especially the parallel compiler. This lack of a proper parallel compiler has hindered the use of multi-core microcontrollers.

Many programming models exist for desktops and servers, for example OpenMP [1], data parallelism [2], and Intel Threading Building Blocks [3]. They are not suitable for multi-core microcontrollers due to the limited resources of the target architecture. In particular, we are interested in one cost-effective multi-core microcontroller from Parallax Inc., an eight-core processor called "Propeller" [4]. Its architecture is very unusual: it has a shared memory with very slow access and fast but limited local memory for each core (see Fig. 1), and its instruction set is tailored to some specific ways of using an individual core. All of this results in a rather difficult programming model for this architecture.

The chip architecture was designed to have 8 independent cores, which Parallax Inc. names cogs. There are 2KB of internal memory (registers) in each core. The shared memory of 32KB RAM and 32KB ROM is accessible through a memory hub, which grants access to only a single core at a time, in turn from cog 0 to 7. All I/O pins are connected to every core. Input pin values can be read and output pin values can be written at any time, but the output pin value is the logical "OR" of the values from all 8 cores, as the output pins can be driven by all of them.

This work presents a compiler for a parallel language targeted at the Propeller. The approach we use in this work is to take a simple imperative language (in this case, a simplified C) and add some decoration to express a parallel constructor for a block of sequential code. The compiler recognizes this decoration and generates proper parallel code for that section. Of course, this approach has the limitation that the types of parallel constructor are limited. We show a number of useful constructors and their applications.

The flow of the article is as follows. The next section discusses the target language, Spin [5], which is embedded with the processor. Section III explains the compiler. Section IV shows examples of the parallel constructors. The conclusion is given at the end.

2 Propeller Spin Language

The Spin language is embedded as an interpreter in the microcontroller; it is the solution shipped with the chip. Its intention is to make available a language that is used to control and specify the parallel operations of the multiple cores, thereby simplifying how users can program the many cores of the chip. The structure of a Spin program is as follows. Global variables are declared before the functions, followed by the global constant declarations. Object modules, which are a kind of small function library, or classes in the OOP sense, can also be declared for use. The main function has all local variables declared immediately after the function declaration; the body of the function then follows with the function's statements.

Fig. 2 shows a simple program written in Spin that generates a frequency specified by the user at a customizable output port, starting at port 16. There is one global variable named "wait_delay", which is initialized to the user delay and grows in each iteration. The loop is created by the command "repeat" with a local variable i, which is used to step the output port forward. The value of the pins of port A, which are pins 0 to 31, is set through the register named OUTA; the port is selected by array-variable indexing. The counter register CNT is added to the delay time and passed as a parameter to the "waitcnt" function, which tells the hardware to stall.

Special commands are used to start and stop a Propeller core. The command "cognew" initializes a new available core by uploading the corresponding source code to be run (similar to spawning a thread). The related command "coginit" is used when the programmer needs to identify a specific core by its availability. Both commands require a parameter to be used as stack memory, which holds temporary calls and expression evaluation while the core is starting; a suitable amount of stack space is necessary for a program on a core to run properly. To stop a running core, the "cogstop" command is used. Fig. 3 shows an example of swapping the output using many cores; the coginit command is used to start the specified core with the swap function at the given pin port.

3 Compiler For Parallel Programs

The base language for our compiler is a simplified C called RZ [6]. RZ is a small language; its syntax is very similar to the C language (but without types). It is the language used in a teaching class about compilers. The full set of compiler source code and tools is available at our institution [7]. The new parallel operations are added to the language by using compiler directives scoped over a section of normal code.

"#pragma parallel for" is used to specify parallel operations over a for-loop body. The parameters in the for-loop head - initialization, condition and increment of the loop index - are parsed and stored. They are used for generating the output parallel code. Fig. 4 shows an example of the use of the pragma.

The output of the compiler is the statements in Spin that distribute the work in the body of the for-loop over the available cores. In the Spin language, the command for iteration is "repeat", and to start a new process the command "coginit" is used. As the to-be-run parallel code is issued to many cores, the command to start each core is called; the number of calls is equal to the number of cores. The body of the loop is generated as a parameterized function, and the parameters of the loop are used as the parameters of the "coginit" command.

The code generator generates the body of the loop as a function with arguments for sharing the task and for the local variables. This function contains the loop with the specified iteration (the output Spin code will be shown with the examples in the next section).

3.1 Parallel Constructors:

To illustrate the idea of using #pragma in parallel programs, two constructors are discussed, with examples of parallel programs and the output code from the compiler. These examples are matrix multiplication, reduction and odd-even sort. We believe these examples show the intended use of the parallel #pragma, which is applicable over a wide range of parallel programs.

3.2 Matrix Multiplication:

Fig. 5 shows the pseudo code for matrix multiplication (NxN). Initially C is zero. The number of calculations grows rapidly as the size of the matrix increases: when the matrix size grows from 2x2 to 3x3, the number of calculations grows from 8 to 27. It grows at the rate of N3, where N is the matrix size (NxN).

There are three nested loops. Each element C[i][j] is the inner product of row i of A and column j of B. The parallelization is made at the deepest inner loop. By distributing the calculation to each core, the execution time is reduced in proportion to the number of cores used in the calculation. Fig. 6 shows how to write a parallel version of the matrix multiplication.

We distribute the different "C += A * B" operations over the different cores. To put a large amount of work onto the limited number of cores, two loops (the innermost) are required: the first loop is used to issue work to several cores, and the second loop is used to strip the vector properly for each core. Fig. 7 shows the output of the compiler. The innermost loop is stripped over many cores (a constant CORE) and "coginit" is called for each core. The function "par_fun" is the body of the for-loop. The "@stack[32*l]" is required to allocate stack space for each core.

3.3 Reduction Sum:

Fig. 8 shows the code of reduction sum. We use "#pragma parallel for reduction" to indicate the type of parallel constructor. Initially sum is zero. The vector V (of size N) is reduced to a scalar value by summation. Reduction is done by the divide-and-conquer method: the vector is divided into two halves, the operation is applied to pairs with one element from the first half and one element from the second half, and the result is stored "in place" in the first half of the array. Each iteration reduces the vector by half if there are enough processors to perform the operations concurrently, and there is no data dependency. It takes log2 N iterations to reduce a vector of size N to a scalar. Each pair of numbers can be processed in a core of the Propeller. Using two cores, a vector of size 1024 can be reduced to a scalar value in 10 iterations. Fig. 9 shows the output Spin code for reduction sum.

The logarithm function is used to calculate the number of iterations. The power function is used to find the size of each level of the tree, in order to distribute the work correctly to the available cores.

3.4 Odd-Even Sort:

This is an algorithm used for sorting on parallel systems. The comparison operations can be done in parallel. The algorithm compares adjacent elements. Assume the first round is an even round: the comparisons start at the 0th element. The next, odd, round starts at the 1st element. If there are N cores, the sorting is done in N-1 iterations. Fig. 10 shows the parallel code for odd-even sort, assuming the index starts at 1. The output Spin code is shown in Fig. 11.

The first loop iterates over all items. The second loop iterates over half of the items, because each iteration deals with odd-only or even-only pairs; the number of iterations needed in each round is N/2. The third loop distributes the work over many cores, starting from the odd or even index. The index is calculated using the modulo function, which appears as the double-slash symbol.

Each of the algorithms was compiled, and the result contained the parallel code section. All codes have a dedicated function for the "coginit" command to be called when starting and initializing a core to run in parallel mode.
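To make the round structure of Section 3.4 concrete, here is a plain sequential C++ rendering of odd-even sort; in the paper's version, the comparisons inside each round are what get distributed over the cogs. This sketch is illustrative and is not the paper's RZ or Spin code; it runs n rounds, which always suffices, whereas the paper states N-1 iterations given N cores.

#include <iostream>
#include <utility>
#include <vector>

int main() {
    std::vector<int> a = {7, 3, 9, 1, 8, 2, 6, 4};
    const int n = static_cast<int>(a.size());

    // Round r compares adjacent pairs starting at index r % 2
    // (even rounds start at element 0, odd rounds at element 1).
    for (int r = 0; r < n; ++r) {
        // Each round touches about n/2 independent adjacent pairs,
        // so all comparisons in a round may run in parallel (one per core).
        for (int i = r % 2; i + 1 < n; i += 2)
            if (a[i] > a[i + 1]) std::swap(a[i], a[i + 1]);
    }

    for (int v : a) std::cout << v << ' ';
    std::cout << '\n';   // prints: 1 2 3 4 6 7 8 9
}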

4 List of Figures

Fig. 1. The architecture of Propeller
Fig. 2. The sample LED toggling program in Spin language
Fig. 3. An example of a parallel program written in Spin
Fig. 4. The example of the pragma compiler directive
Fig. 5. Pseudo code for matrix multiplication
Fig. 6. The ready-to-compile parallel version of the matrix multiplication
Fig. 7. The matrix multiplication compilation output
Fig. 8. The parallel code of reduction sum
Fig. 9. The reduction sum output Spin code
Fig. 10. The parallel code for odd-even sort
Fig. 11. The odd-even sort output Spin code


5 Experiment

Three algorithms that can exploit the parallelism are used: matrix multiplication, reduction sum and odd-even sort. Each program is compiled and then run on the Propeller microcontroller chip. The results are recorded for each specific test configuration, according to the number of cores used.

The following figures show how the execution time of each program varies with the number of cores used. For the matrix multiplication, Fig. 12 shows that the speedup increases when more cores are used. Compared to a single core, the speedup of 6 cores on a 48x48 matrix is 1.2. Larger matrix sizes have higher speedup, as the overhead is smaller compared to the total execution time.

Fig. 12. Matrix multiplication execution time: multi-core vs. single-core

The result of the reduction sum program is shown in Fig. 13. The results for small data sizes indicate that 6 cores are slower than 3 cores; as the data size increases, the 6-core speedup becomes better than the 3-core one. The reason for this anomaly, in which 6 cores are slower on small data sizes, is the larger overhead of the core initialization process.

Fig. 13. Reduction sum execution time: multi-core vs. single-core

The result of the odd-even sort is shown in Fig. 14. For 3 cores, the speedup appears to be almost independent of the size of the data.

Fig. 14. Odd-Even sorting execution time: multi-core vs. single-core

6 Conclusion

This work presents a parallel compiler for a particular multi-core microcontroller. The compiler directive "#pragma" is used to hint the compiler to generate proper code for a parallel section. Two parallel constructors are introduced. Experiments are performed to measure the speedup of the execution time while varying the number of cores used. The results show that the compiler generates correct output code that can exploit multiple cores to speed up the computation.

7 Acknowledgment Somdip Dey expresses his gratitude to all his fellow students and faculty members of the Computer Science Department of St. Xavier’s College [Autonomous], Kolkata, India, for their support and enthusiasm. He also thanks Dr. Asoke Nath, professor and founder of Computer Science Department of St. Xavier’s College (Autonomous), Kolkata, for his constant support and helping out with the preparation of this paper.

8 References

[1] L. Dagum and R. Menon, "OpenMP: An Industry Standard API for Shared Memory Programming," IEEE Computational Science and Engineering, vol. 5, no. 1, pp. 46-55, Jan-Mar 1998.
[2] W. Daniel Hillis, Guy L. Steele, Jr., "Data parallel algorithms," Communications of the ACM - Special Issue on Parallelism, vol. 29, no. 12, Dec 1986.
[3] N. Popovici and T. Willhalm, "Putting Intel Threading Building Blocks to work," Proc. of the International Workshop on Multicore Software Engineering (IWMSE 2008), Leipzig, Germany, May 11 2008.
[4] Parallax Propeller, http://www.parallax.com/tabid/407/Default.aspx, 2012. [ONLINE]
[5] Martin, J. and Lindsay, S., "Parallax Propeller Manual," 2006.
[6] RZ language and its compiler, http://www.cp.eng.chula.ac.th/faculty/pjw/project/rz3/index-rz3.htm, 2012. [ONLINE]
[7] RZ compiler tools kit, http://www.cp.eng.chula.ac.th/faculty/pjw/project/rz/rz2compiler.htm, 2012. [ONLINE]

Data Center Design and Implementation at the University

Askar Boranbayev1, Sergey Belov1 1Nazarbayev University, Astana, Kazakhstan

Abstract - Universities are trying to make their academic and business processes more efficient and effective. In our opinion the appropriate use of information technology may be an important source of future success for a university. This paper describes one strategy for moving a university toward more effective IT using data center virtualization technologies while building a modern data center. Virtualization is a technology that helps data centers achieve more agile operation, increase availability, improve security, and possibly reduce cost.

Keywords: Data center, virtualization, information technology, servers, network and efficiency

1 Introduction

A university's fundamental technological base needs to be composed of appropriate modern IT hardware and software systems. We have designed and implemented a new data center at our university. It has good physical security, with card-based access and tracking of all personnel entering the data center; video recording of all activity; a good cooling system; raised floors; UPS power for the entire data center; the latest blade server technology, using server virtualization; and information systems running on servers to automate various business and academic processes at the university. This project was initiated to solve many of the IT infrastructure problems the university was facing, such as backup and recovery services, among other things. The design included the review and evaluation of the experiences of other data center implementations and their infrastructure, and the development of a component selection and configuration. The design consists of a combination of proprietary and open-source infrastructure solutions.

2 Virtualization

Usually a large percentage of a university's expenditures involves IT. The university IT department had the task of implementing high-performance, efficient and yet cost-effective solutions for the server hardware and software of the data center. We looked at a variety of server platform architectures, such as Itanium, SPARC, PA-RISC and x86, and we settled on the x86 architecture because of its efficient, economical and fast-growing technology.

After studying the experience of other universities, we understood that most high-performance servers are loaded only at 25-30 percent [1] when performing their daily tasks. This is why we decided to emphasize virtualization technology. Nowadays, the concept of a "virtual machine" has ceased to be something unusual; many universities have learned to use virtual machines in their daily operation to increase the efficiency of their IT infrastructure. IT infrastructure virtualization is the process of representing a set of computing resources, storage resources, or applications by combining them logically and under centralized management, which provides various advantages over the original standard configuration [3][4]. Virtual, on-demand access to resources is not limited by the geographic location or physical configuration of the components. Usually virtualized resources include computing power and data storage.

Several useful aspects of this technology were taken into account during the implementation:
- The use of multiple virtual servers on one physical server can increase server utilization up to 80 percent, while providing substantial savings on the purchase of hardware.
- Virtual servers are a very good solution, as they may be moved to other platforms in a short time when the physical server is experiencing increased overload.
- Backing up virtual machines and restoring them from backups takes much less time and is easier. In case of equipment failure, the backup copy of a virtual server can be started immediately on another physical server.
- The simplicity and flexibility of managing a virtual infrastructure enables centralized management of virtual servers and provides load balancing and migration.
- The demand for floor space, electric power and cooling decreases, reducing electricity costs by 80 percent [2].
- Simplifying the management of virtual servers brings savings on the technical specialists managing the data center infrastructure.

The university acquired 57 Hewlett-Packard servers with 12-core AMD Opteron 6174 processors running at a frequency of 2.2 GHz. The total number of processors is 132, and the total amount of installed RAM is 6912 GB. Two central modular Hewlett-Packard switches, combined by a proprietary redundancy technology called IRF, provide communication with the servers at a speed of 10 Gb/s (see Diagram 1 below).

Diagram 1

A SAN (Storage Area Network) was also deployed, at speeds of up to 8 Gb/s. For storage, we used a 3PAR array of 100 TB, oriented toward virtualization software.

2.1 VMware-based virtual environment

A VMware vSphere Enterprise Plus license was purchased, and the package was implemented as the virtualization software; VMware is one of the pioneers of this technology, and the product provides great features both today and in the future [3]. It was developed by the software company VMware [4]. In 1998, VMware patented its virtualization technology and has since released many effective and professional virtualization products at various levels. We tested the freeware version of this company's software product before procuring it.

The hypervisor VMware ESXi version 4.1 was installed on each of the 57 servers; VMware first introduced its ESXi hypervisor at the end of 2007 [5]. To provide fault tolerance, all servers were divided into two groups: the first group was dedicated to business and administrative services, and the second group to services needed for teaching and for research centers. Both groups were combined into two fault-tolerant clusters for high availability with load balancing.

VMware vCenter Server, a critical component of a VMware-based virtual environment, is used to manage the entire infrastructure [6]. This solution allows reallocating computing resources between virtual servers depending on load and resource requirements. In the event of the failure of one or more servers in a cluster, the group of virtual machines that were located on the failed server is automatically restarted on the surviving servers. The workload of each server is taken into account when we restart virtual machines on a surviving server, through dynamic allocation of resources.

A clustering service called vCenter Server Heartbeat is deployed to ensure high availability of the server management service VMware vCenter Server; it combines two VMware vCenter servers, an active one and a passive one. These two constantly exchange messages about availability, and in the case of a failure of the active server, all the services automatically start on the passive server. vCenter Server Heartbeat works like a traditional cluster [7]. We need to have a private link between the active and passive nodes to maintain the heartbeat process [7]. vCenter Server Heartbeat hides the passive node from the network using packet filtering, so it is not entirely like MSCS, where we have a separate IP address for each node and a virtual IP to operate on [7].

The established infrastructure successfully passed several tests for fault tolerance, during which we emulated various emergency situations.

Until this year the university experienced difficulties associated with the maintenance of computer classes and access to students' personal computers. As a result of our analysis, we implemented desktop virtualization, VDI (Virtual Desktop Infrastructure), which reduced the cost of supporting the IT infrastructure and provided students with the opportunity to work with all the necessary resources from any computer [8].

When using this solution, instead of on physical machines, tasks are performed on virtual machines in a virtual environment, hosted on servers in the data center. A user can connect to a virtual machine from any workstation and work with it through the network. In this case a physical workstation serves as an access point and is not tied to a particular place or user. This allows creating a virtual machine for each student without having to buy additional computers; we can now use "thin clients" or regular PCs as access points. This kind of setup has a high degree of security and virus protection, because all the workstations are stored centrally on the servers of the university.

VMware View holds a central place among the proposed VMware solutions and was chosen for the desktop virtualization task. A new remote access protocol, PCoIP (PC-over-IP), was used to implement the remote desktop technology; it is able to adapt to the characteristics of the network connection and the capabilities of the client computer, choosing the optimum parameters for the job. PCoIP was designed and developed by a third-party company. PCoIP is a type of display protocol used by remote desktops when doing desktop virtualization [9]. A display protocol is what delivers the desktop from the host server to the remote user, along with capturing mouse and keyboard inputs [9]. The protocol transmits an HD image, access to USB devices, and access to the LAN and WAN. It served as the basis for implementing the virtualization platform VMware vSphere / ESXi [6].

We deployed three pools of virtual desktops, each with its own unique settings. The first pool is designed for students, the second pool for faculty, and the third for administrative and academic management. This allowed granting users access to full-fledged workstations running Windows 7 Professional, meeting the requirements of each class of users. For the VDI infrastructure we used a separate fail-proof VMware cluster. The first two pools are used to give students and teachers access to personal workstations and personal information. The third pool is used for temporary access under guest credentials in public access places and the library.

View Client was installed on a thin client or client system, which provides access to the workstations. Two versions are offered - the regular one and one "with local mode". The second one allows us to upload a virtual machine to a local computer to work without a connection to the View Manager.

A problem with the speed of the PCoIP protocol was noticed when using VDI, resulting in a noticeable slowdown of Windows personal workstations. At the moment we are looking for solutions to optimize the protocol speed together with VMware experts. This experience should be a good topic to research and discuss in future papers.

3 Conclusions

In conclusion, we would like to mention that VMware View is a complex product, but easy to manage. Most of the setup options are intuitive and transparent to our systems administrators and do not require relying on documentation very often. Overall, the implementation of virtualization technologies as part of the university's data center was successful. The selected technologies have proved to be efficient and functional, and the overall goals have been achieved.

4 References

[1] http://www.vmware.com/ru/virtualization/cost-savings/index.html
[2] http://www.vmware.com/ru/products/datacenter-virtualization/vsphere/
[3] VSphere Enterprise Plus: Persuading your boss to upgrade your VMware licenses, by Eric Siebert: http://searchvmware.techtarget.com/tip/VSphere-Enterprise-Plus-Persuading-your-boss-to-upgrade-your-VMware-licenses
[4] http://www.vmware.com/products/vsphere/mid-size-and-enterprise-business/buy.html
[5] ESX vs. ESXi: Convincing your boss to move to ESXi, by Eric Siebert: http://searchvmware.techtarget.com/tip/ESX-vs-ESXi-Convincing-your-boss-to-move-to-ESXi
[6] Stratus Uptime Appliance For VMware vCenter Server, by Dan Kusnetzky, February 22, 2012: http://www.zdnet.com/blog/virtualization/stratus-uptime-appliance-for-vmware-vcenter-server/4644
[7] vCenter Server Heartbeat, the concept, deployment and considerations, by Hany Michael, April 4th, 2009: http://www.hypervizor.com/2009/04/vcenter-server-heartbeat-the-concept-deployment-and-considerations/
[8] Virtual Desktop Infrastructure Has Cost-Cutting Potential, by Cameron Sturdevant, January 13 2009: http://www.eweek.com/c/a/IT-Infrastructure/Virtual-Desktop-Infrastructure-Has-CostCutting-Potential/
[9] PC-over-IP remote display technology: The inner workings of VMware View 4, by Eric Siebert, November 11 2009: http://itknowledgeexchange.techtarget.com/virtualization-pro/pc-over-ip-remote-display-technology-the-inner-workings-of-vmware-view-4/

Online Task Scheduling Algorithm for Sporadic Tasks in Single-ISA Heterogeneous Multi-Core Architectures* Yen-Ting Huang1, Yi-Jung Chen2, and Dyi-Rong Duh3,† 1,2Department of Computer Science and Information Engineering National Chi Nan University Puli, Nantou Hsien 54561, Taiwan {s98321504, yjchen}@ncnu.edu.tw 3Department of Computer Science and Information Engineering Hwa Hsia Institute of Technology 111, Gong Jhuan Rd., New Taipei City 23568, Taiwan [email protected]

Abstract - Single-ISA heterogeneous multi-core architectures have been proposed to reduce processor power dissipation. This work targets single-ISA heterogeneous multi-core architectures and proposes an on-line task scheduling algorithm that dynamically assigns independent sporadic tasks in the system. Different from previous works that schedule tasks to minimize power consumption or maximize performance only, the proposed on-line task scheduling algorithm considers performance and power consumption at the same time. The experimental results show that, when comparing the energy-delay product, our algorithm is only 1.07 times over the algorithm that has oracle information and achieves the best results on average.

Keywords - Single-ISA multi-core architecture; heterogeneous multi-core architecture; performance; power consumption; on-line algorithm.

I. INTRODUCTION

Recently, single-chip multi-core processors have become a new trend in the design of high-performance and low-power microprocessors [4, 5, 6, 7]. In order to provide significantly higher performance in the same area than a classic chip multiprocessor, a single instruction set architecture (single-ISA) heterogeneous multi-core architecture, which is a chip multiprocessor made up of cores of varying size, performance, and complexity, has been proposed by Kumar et al. [5]. In this architecture, the various tasks of a diverse workload are assigned to the various cores to reduce power consumption or improve performance according to user demand. Notably, multi-core processors improve performance through thread-level parallelism, but this type of architecture can provide high single-thread performance even when thread parallelism is low [5]. Since each core in a single-ISA heterogeneous chip multiprocessor executes the same ISA, each task or task phase can be mapped to any of the cores [6].

A heterogeneous multi-core architecture possesses different types of cores, whereas a homogeneous multi-core architecture consists of cores of the same type. For different tasks, a heterogeneous multi-core architecture can provide more suitable cores than a homogeneous one. The energy consumption and the running time differ when different tasks run on different cores: some cores provide high performance but may consume more energy, and some cores consume less energy but execute more slowly. Different tasks suit different cores for lower power consumption and shorter time. If tasks are always assigned to a high-speed core, they may take less time than on a low-energy core but increase the power consumption; if tasks are always assigned to a low-energy core, they may take longer than on a high-speed core but reduce the power consumption. Therefore, the scheduling of tasks running on a single-ISA heterogeneous multi-core architecture is an important issue.

Several works that target single-ISA heterogeneous multi-core architectures have been proposed. Kumar et al. proposed the idea of single-ISA heterogeneous multi-core architecture to reduce processor power dissipation [4]. Rao et al. used the upper bound of the cores' temperature to obtain the maximum throughput [7]: the speeds of the cores are efficiently computed online to obtain a speed matrix, which is then used to determine the optimum speed combination. Xu et al. proposed a light-weight operating system scheduler for single-ISA asymmetric multi-core processors that uses online profiling [8]. Xu et al. classified cores into fast cores and slow cores, analyzed the programs, and assigned CPU-intensive programs to run on the fast cores and memory-intensive or I/O-intensive programs to run on the slow cores. Notably, Kumar et al. only consider the offline problem. Since the arrival time of every task is usually unknown, this work focuses on the more realistic problem - the online task scheduling problem. Recall that Rao et al.'s work must first compute the speed matrix so that the optimum speed combination can be obtained, and Xu et al.'s work only classifies tasks into two types, CPU-intensive and I/O-intensive. Each task can be partitioned into several phases, which may not all work well on one core. This paper decomposes every task into several phases, maps each to an appropriate core, and analyzes every phase of the task.

* This work is partially supported by NSC 100-2221-E-146-014-.
† Correspondence to: D.-R. Duh; E-mail address: [email protected]

This work derives an on-line task scheduling algorithm for sporadic tasks that execute in single-ISA heterogeneous multi-core architectures. We assume the tasks are invoked randomly, and the arrival time of each task in the task set is unknown. The goal of the proposed on-line task scheduling algorithm is to reduce the energy-delay product. The proposed algorithm is a greedy-based method that assigns a coming task to the core that has the least energy-delay product among the unoccupied cores. We compare the proposed algorithm to three algorithms: the random algorithm, the energy-optimize algorithm and the oracle algorithm. When tasks come, the random algorithm assigns the tasks to the cores randomly; the energy-optimize algorithm assigns the tasks, in order, to the core whose energy consumption is the least; and the oracle algorithm computes the energy-delay products of all assignments and chooses the one with the least energy-delay product. In the energy-optimize algorithm, only one task can run on a core at a time, and a task always chooses the core that saves the most energy; this algorithm saves the most energy but wastes time. The oracle algorithm assumes the arrival times of the tasks are already known and always chooses the most energy-saving assignment of the various tasks to the various cores. The experimental results show that the average energy-delay product of our algorithm is only 1.07 times that of the oracle algorithm, which is very close. Moreover, the energy consumption of the proposed on-line task scheduling algorithm is 1.2 times that of the energy-optimize algorithm on average.

The rest of the paper is organized as follows. Section 2 describes the proposed online task scheduling algorithm on the single-ISA heterogeneous multi-core architecture. Section 3 states the experimental setup of this paper. The conclusion and summary of this work are drawn in Section 4.

II. PROPOSED ALGORITHM

A greedy-based online task scheduling algorithm is provided to reduce the energy-delay product. We assume the energy consumption and execution time of a task running on a specific core are known in advance. Each single core can only run a single task at a time, and unused cores are completely powered down. Tasks are reassigned to cores when a context switch happens. Tasks are non-preemptive: a task is not interrupted once it has started executing.

The input of the online task scheduling algorithm is the target single-ISA heterogeneous platform and the task set. The platform description includes the energy consumption and working frequency of each core. For each task in the task set, we record its energy consumption and running time on each single-ISA core. The output of the online task scheduling algorithm is a schedule of tasks to cores. The optimization goal of this work is to reduce the energy-delay product of the tasks on the cores. The energy-delay product is the total execution time multiplied by the sum of the energy consumption of each task. Let T and Ei denote the total execution time and the energy consumption of task i; then

energy-delay product = T × Σ Ei

As mentioned earlier, the proposed on-line task scheduling algorithm is a greedy-based method that assigns tasks to the cores. The flow of the online task scheduling algorithm is detailed below and shown in Figure 1 (a flowchart that, at each context switch, checks for arriving tasks, finds the idle core with the least energy-delay product, and assigns one million instructions of the task to it).

Figure 1. The flow of the online task scheduling algorithm.

Algorithm Online Task Scheduling
{
  When a context switch happens, check whether any task is coming.
  If there is a task coming, check whether there is an idle core.
    1. If so, look up the energy consumption and execution time of the task
       on each single-ISA core.
    2. Find the idle core with the least energy-delay product and assign one
       million instructions of the task to that core.
    3. Check whether there is any waiting task.
       If so, check whether there is an idle core; if so, jump to 1.
    4. Otherwise, execute the million instructions of the tasks and wait for
       the next context switch.
  Otherwise, wait for the next context switch.
} // End of Algorithm Online Task Scheduling
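A compact way to express the greedy selection step is sketched below. This is our own illustrative C++ rendering of the algorithm's core decision, not the authors' simulator; the per-core power/time model and the one-million-instruction quantum follow the paper's description, while the numeric values and names are hypothetical.

#include <iostream>
#include <vector>

struct Core { double power; double timePerMi; bool busy; };  // timePerMi: time per million instructions

// Pick the idle core minimizing the energy-delay contribution of one
// million-instruction quantum: energy x delay = (power * time) * time.
int pickCore(const std::vector<Core>& cores) {
    int best = -1;
    double bestEdp = 1e300;
    for (int c = 0; c < static_cast<int>(cores.size()); ++c) {
        if (cores[c].busy) continue;                 // only unoccupied cores qualify
        double energy = cores[c].power * cores[c].timePerMi;
        double edp = energy * cores[c].timePerMi;
        if (edp < bestEdp) { bestEdp = edp; best = c; }
    }
    return best;   // -1 means no idle core: the task must wait
}

int main() {
    // Hypothetical cores patterned on the paper's Core A-D ranges:
    // slow/low-power first, fast/high-power last.
    std::vector<Core> cores = {
        {30, 120, false}, {56, 73, false}, {73, 56, false}, {120, 30, false}};
    std::cout << "assign quantum to core " << pickCore(cores) << '\n';
}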

III. EXPERIMENTAL RESULTS

A. Experimental Setup

To evaluate the proposed on-line task scheduling algorithm, we developed a simulation platform in C++ to simulate the behavior of the proposed algorithm on the target single-ISA multi-core architecture. The host machine running the simulation platform is a PC with an Intel® Core™ i5-2005S 2.7 GHz CPU and 4.00 GB of RAM, running Windows 7.

In our experiments, we assume the target single-ISA heterogeneous multi-core architecture is composed of four cores (Core A, Core B, Core C, and Core D). The power ranges of the four cores are based on the Alpha processors [1, 2, 3]: the Alpha 21064 (known as the EV4), the Alpha 21164 (EV5), the Alpha 21264 (EV6), and the Alpha 21364A (EV78). Table 1 lists the Alpha cores this paper refers to. The power range of Core A is based on the Alpha 21064, that of Core B on the Alpha 21164, that of Core C on the Alpha 21264, and that of Core D on the Alpha 21364A. The power-consumption ranges of the four cores are therefore: Core A: 20~40 W, Core B: 46~66 W, Core C: 63~83 W, and Core D: 110~130 W. The execution time of a core is usually inversely related to its power, so we set the execution-time range of each core based on this assumption. As shown in Table 2, the execution-time ranges of the four cores are: Core A: 110~130 unit time, Core B: 63~83 unit time, Core C: 46~66 unit time, and Core D: 20~40 unit time.

Table 1. The Alpha cores this paper refers to.

  Core   Model number   Power [W]   Die size [mm2]
  EV4    21064          30          234
  EV5    21164          56          299
  EV6    21264          73          314
  EV78   21364A         120         300

Table 2. The parameter ranges of the simulated cores.

  Core     Power range [W]   Execution-time range [unit time]
  Core A   20~40             110~130
  Core B   46~66             63~83
  Core C   63~83             46~66
  Core D   110~130           20~40

For every task set, we assume there are ten distinct tasks, each of ten million instructions; for each task, every core randomly picks a power and an execution time within that core's range. The ten distinct tasks also arrive at random times. The context-switch interval is defined as the execution of one million instructions (corresponding roughly to an OS time-slice interval).
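As a concrete illustration of this setup, here is a small C++ sketch that draws one random task set under the stated assumptions. The structure names are hypothetical, and the relation energy = power × execution time is our assumption, since the paper specifies ranges for power and execution time but records energy.

#include <random>
#include <vector>

// Profiled data for one (task, core) pair: energy and execution time.
struct Profile { double energy; double time; };

// Per-core parameter ranges from Table 2:
// { power low, power high, execution-time low, execution-time high }.
struct CoreRange { double pLo, pHi, tLo, tHi; };

const std::vector<CoreRange> kCores = {
    {20.0, 40.0, 110.0, 130.0},   // Core A (Alpha 21064 / EV4)
    {46.0, 66.0, 63.0, 83.0},     // Core B (Alpha 21164 / EV5)
    {63.0, 83.0, 46.0, 66.0},     // Core C (Alpha 21264 / EV6)
    {110.0, 130.0, 20.0, 40.0},   // Core D (Alpha 21364A / EV78)
};

// Draw one task's profile on every core: power and execution time are
// sampled uniformly from the core's range; energy = power * time is
// our assumed relation.
std::vector<Profile> drawTaskProfiles(std::mt19937& rng) {
    std::vector<Profile> profiles;
    for (const CoreRange& c : kCores) {
        std::uniform_real_distribution<double> power(c.pLo, c.pHi);
        std::uniform_real_distribution<double> time(c.tLo, c.tHi);
        const double w = power(rng);
        const double t = time(rng);
        profiles.push_back({w * t, t});
    }
    return profiles;
}

// One task set: ten distinct tasks of ten million instructions each,
// i.e., ten one-million-instruction context-switch intervals per task.
std::vector<std::vector<Profile>> drawTaskSet(std::mt19937& rng) {
    std::vector<std::vector<Profile>> set;
    for (int i = 0; i < 10; ++i)
        set.push_back(drawTaskProfiles(rng));
    return set;
}

A usage example would seed std::mt19937 once and call drawTaskSet one hundred times to reproduce the hundred-set experiment described below.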
B. Algorithms of Comparison

This section compares the online task scheduling algorithm to three algorithms: the random algorithm, the energy-optimize algorithm, and the oracle algorithm.

The first is the random algorithm. When a context switch happens, each arriving task is randomly assigned to any idle core until all cores are occupied. If all cores are occupied, the waiting tasks wait until a core becomes free.

The second is the energy-optimize algorithm, the optimum algorithm for saving energy. Every million instructions, each arriving or running task chooses the core with the lowest energy consumption. Only one task can be executed at a time; the next arriving task waits for the previous task to finish and then chooses the core with the lowest energy consumption. This algorithm may spend a lot of time, but every task always runs on the most energy-saving core.

The third is the oracle algorithm, which provides the upper bound of the optimization goal. It assumes the execution times of the tasks are known in advance. Each arriving task chooses the core with the least energy-delay product. If other tasks need to be executed, the execution time, power consumption, and energy-delay product of every task on every core are calculated, and the combination of assignments with the least total energy-delay product is chosen. Notice that if all cores are occupied, the next task waits until there is an idle core.

C. Simulation Results

This section analyzes the simulation results of the online task scheduling algorithm. This work runs one hundred task sets through the simulator, obtains the energy consumption and running time of each task set on each single-ISA core, and then calculates their energy-delay products.

Figure 2 shows the results, with the energy-delay product normalized to the oracle algorithm. Across the 100 task sets, in a few of the sets the energy-delay product of the random algorithm is close to that of the online task scheduling algorithm; for most of the sets, however, the energy-delay product of the online task scheduling algorithm is much smaller than that of the random algorithm. As Figure 2 shows, the online task scheduling algorithm is close to the oracle algorithm even though, unlike the oracle, it does not know the arrival times of the tasks.

Figure 2. Energy-delay product normalized to the oracle algorithm.
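The reported ratios (1.07 times the oracle's energy-delay product, 1.2 times the energy-optimize algorithm's energy) are averages of per-task-set values normalized to a baseline policy. A minimal C++ sketch of that bookkeeping, assuming a hypothetical RunResult record produced by each policy's simulation of one task set:

#include <cstddef>
#include <vector>

// Aggregate outcome of simulating one task set under one policy.
struct RunResult { double totalTime; double totalEnergy; };

// Energy-delay product as defined in Section II: EDP = T * sum(Ei).
double edp(const RunResult& r) { return r.totalTime * r.totalEnergy; }

// Average, over the one hundred task sets, of a policy's per-set EDP
// normalized to the baseline policy's EDP on the same task set.
double avgNormalizedEdp(const std::vector<RunResult>& policy,
                        const std::vector<RunResult>& baseline) {
    double sum = 0.0;
    for (std::size_t i = 0; i < policy.size(); ++i)
        sum += edp(policy[i]) / edp(baseline[i]);
    return sum / static_cast<double>(policy.size());
}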

Figure 4. Normalized the energy consumption to the energy- Figure 2. Normalized the energy-delay product to the oracle optimize algorithm. algorithm. This paper also aims at the tasks if they consume more IV. CONCLUSION than 20% of energy to the original tasks. Figure 3 is the This work provides an online task scheduling algorithm to energy-delay product normalized to the oracle algorithm when reduce the energy-delay product of the single-ISA the tasks spend more than 20% of energy. According to Figure heterogeneous multi-core architecture when the coming times 3, for few sets, the energy-delay product of the random of the tasks are unknown. The energy-delay product of the algorithm is close to the online task scheduling algorithm and online task scheduling algorithm is close to the oracle even smaller than the online task scheduling algorithm, but for algorithm and better than the random algorithm and the the average, the energy-delay product of online task energy-optimize algorithm. Furthermore, the energy scheduling algorithm still much better than the random consumption of the online task scheduling algorithm is close to algorithm. The energy-delay product of online task scheduling the oracle algorithm and better than the random algorithm. algorithm is still close to the oracle algorithm. According to the requirement, user can choose the most suitable algorithm.

REFERENCES [1] Alpha 21064 and Alpha 21064A: Hardware Reference Manual. [2] Alpha 21164 Microprocessor: Hardware Reference Manual. [3] Alpha 21264/EV6 Microprocessor: Hardware Reference Manual. Compaq Corporation, 1998. [4] R. Kumar, K. I. Farkas, N. P. Jouppi, P. Ranganathan and D. M. Tullsen, “Single-ISA Heterogeneous Multi-core Architecture: The Potential for Processor Power Reduction”, In: Proceeding of the 36th International Symposium on Microarchitecture, Dec. 2003, pp. 81–92. [5] R. Kumar, D. M. Tullsen, P. Ranganathan, N. P. Jouppi and K. I. Farkas, “Single-ISA Heterogeneous Multi-core Architectures for Multithreaded Workload Performance”, In: Proceedings of the 31st Figure 3. Normalized the energy-delay product to the oracle International Symposium on Computer Architecture, June, 2004, pp. algorithm when the tasks spend more than 20% of 64–75. energy. [6] R. Kumar, D. M. Tullsen, N. P. Jouppi and P. Ranganathan, “Heterogeneous Multi-core Architectures ”, IEEE Transactions on Figure 4 is the energy consumption normalized to the Computers, pp. 32–38, 2005. energy-optimize algorithm. The random algorithm spends [7] R. Rao and S. Vrudhula, “Efficient Online Computation of Core Speeds more energy than the online task scheduling algorithm and the to Maximize the Throughput of Thermally Constrained Multi-Core Processors.” In: Proceedings of the 2008 IEEE/ACM International oracle algorithm. The energy consumption of the online task Conference on Computer-Aided Design (ICCAD), San Jose, California, scheduling is close to the energy consumption of the oracle pp. 537–542, November 10–13, 2008. algorithm. [8] Y. Xu, L. Du and Z. Zhang, “A Light-Weight Scheduler for Single-ISA Asymmetric Multi-core Processor Using Online Profiling." In: Proceedings of 12th ACIS International Conference on Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing, 2011, pp. 183–188.