A Power Efficient Register File Architecture Using Master Latch Sharing

A Power Efficient Register File Architecture Using Master Latch Sharing

A POWER EFFICIENT REGISTER FILE ARCHITECTURE USING MASTER LATCH SHARING M. Wrdblewski', M. Mueller'p2, A. Wortmann2,S. Simon', U! Piepe?, J. A. Nossekl Munich University of Technology Hochschule Bremen Marek.Wroblewsla @ei.tum.de [email protected] ABSTRACT 4 This paper introduces a method of reducing area and power con- D Q sumption of a synthesizable register tile by using a single master latch shared by a number of slaves. It investigates potential timing problems and discusses possible solutions. Presented simulation results show that, depending on the size of the register tile, reduc- tion of power consumption of more than 50% is achievable. CK 1: INTRODUCTION Wold-level Regmter The reduction of power dissipation in modem VLSI circuits is one h, DI &I of the most important.challenges. With increased die size and ad- vanced process technologies, whole systems with an ever-growing number of transistors are integrated on a single chip. The prob- Dz Q2 lem of heat dissipation in low cost packages of high performance systems [4] and a growing demand for portable consumer prod- ucts with an independent power supply drive the development and application of low-power design techniques [I]. ii Data stores are awimportant power critical part of resource D,,, sharing architectures (21 or processing units, like application spe- cific instruction set processors (ASPS). They are preferably im- CK plemented as a synthesizable register file described as part of the design on register transfer level, because of a high effort required for timing verification of RAM. Figure I : a) Master-slave D-flip-flop. b) Word-level register. Therefore, in this paper we consider data stores implemented in this manner. In general. they consist of a number of word level registers. The data inputs of the registers are connected to the data 2. MODIFIED REGISTER FILE bus and represent a high capacitive load. This results in a high power consumption per signal transition. While the number of un- Storage elements found in libraries of semicustom technology pro- necessary transitions can he kept low by integration of glitch bar- viders are frequently realized as edge-triggered master-slave D- riers into the signal path, further improvements can be achieved flip-flops as shown in Figure 1 a). They consist of a master and a by reduction of the capacitance of the data bus, e. g. as proposed slave latch. If a value is to he stored into the flip-flop, the master in 131. A rearrangement of the registers in the register tile is per- latch freezes the value with the triggering edge of the clock signal formed, reducing the number of registers connected directly to the and the slave latch becomes transparent. The value is visible at the data bus. outpur of the flip-flop. At the other edge of the clock signal, the This paper presents a method which also aims at reduction of slave latch stores the value and the master latch gets transparent. capacitance connected to the data bus. This is achieved by splitting For data stores, w flip-flops are grouped to word-!eve1 registers, up the master-slave flip-flops into the master latches and the slave where w denotes the number of bits of the data signal. Each bit latches. If clock gating is applied, slave latches of registers in of the data signal is connected to one flip-flop of the word-level the register tile can share one master latch. Thus the number of register, as shown in Figure I b). Register files, in which r data master latches connected to the data bus is decreased. Additionally values can be stored, consist ofr word-level registers, as shown in savings in area can he expected. Figure 2a). Since the data inputs of all registers are connected to The paper is organized as follows: In Section 2, the moditied the data bus this leads to a high bus capacitance. register tile is denved and explained. Section 3 addresses timing- TWO approaches are commonly used to update data stored in related issues involved in application of this method. In Section 4 registers. First, a feedback loop from the output is connected via experimental results are presented. a multiplexer to the input of the flip-flop. If a new value is to 0-7803-7761-3/03/517.00 02003 IEEE v-393 ..-......- G I c.n Figure 2 a) Conventional register file with flip-flops. b) Modified register file with shared master latches. be stored it is routed via the multiplexer to the input. Otherwise This configuration has the following advantages: the output of the flip-flop is connected to the input and the stored Since there is only one master latch connected to the data information is rewritten in every clock cycle. Additional enable bus, the capacitive load is reduced. Hence, a reduction of logic is responsible for controlling the multiplexer. power dissipation in driving modules can be expected. The second approach is the application of clock gating. Due to a Transitions seen on the data bus cause only the internal ca- significantly reduced power consumption this is usually preferred pacitances of a single latch to be recharged, and this only and in the following we only consider this case. This method was during second half of clock period (the shared master latch applied to the register file shown in Figure 2. Here, logic and possi- is clocked in every clock period). This reduces power con- bly sequential cells (represented as blockCC in Fig. 2) are inserted sumption of the register file. into the clock path. The same enable logic as in the previous case e Less area is required for the register file (and possibly for is used, this time however to decide whether to clock a given reg- the driving blocks). ister. Thus data is written only when it changes and clock signals of not selected registers are disabled. 3. As consequence however every master latch is transparent all TIMING the time except when new data written into its slave. Every is Obviously the reduction of power consumption comes at a cost. transition on the data bus repeated by all master latches, although is By splitting up the master and slave latches of a flip-flop we give most of them (usually all but one) do not need to pass the new up the main advantage of the foundry-designed flip-flop - the well state to their slaves. This requires that internal capacitances of defined timing conditions. The master and the slave placed in a these latches (and input capacitances slaves) recharged and of be single cell are adjacent and any clock signal delays can be com- increases power consumption. pensated when designing the cell. When using clock gating in this In the following we limit our investigations to architectures scenario both the master and the slave are clock gated. where only a single piece of data can be stored per clock cycle. It It is not so in the proposed approach. We only apply clock may he stored into many registers simultaneously, but all registers gating to the slave latches, as master is shared and operates in every with clock signal enabled receive the same value. clock cycle. As the slaves' clock signal is delayed by the clock Under this assumption it is possible to replace all r master gating cells it cannot arrive at the clock inputs of the slaves at the latches connected to a specific bit-line of the data bus by one single same instant as the clock signal of the master arrives at its clock master latch. The corresponding slave latches share this master input. This may lead to serious functionality problems as we show latch. This is shown in Figure 2 b). later. v-394 I la' I Ibl I 2a I 2b ' Figure 3: Enable signals of master and slave latches of a split-up flip-flop. I GK Secondly we have vittually no control over the physical place- Figure 4 Synchronization of clock signals of master and slave ment of the components of a single flip-flop. In a large design the latches. delays introduced by careless placement may render the design un- usable. We discuss this issue in section 3.2. is the critical case that we address in the following in more detail. 3.1. Clock gating related timing issues From the above said it becomes evident that if we can avoid Let us first evaluate the effects of the delay introduced by clock case 2b (i. e. Gemaster leads Gaslave) then we should be able gating cells. to guarantee the proper function of the split-up flip-flop. As ex- For sake of the following discussion let us assume that we use plained it is not imperative that Gemaster and G@slavebe rising-edge triggered flip-flops and their slave latches are transpar- perfectly synchronous. In order to achieve this we may delay ent when their enable signalsG are low. We consider the following Gemaster by inserting a quasi clock gating cell as shown in cases: Fig. 4. We use ORed enable signals of the slaves as the enable 1. GOmaSter follows GOslave (cf. lefl portion of the dia- signal for the master. Assuming modest logic depth of the slaves' gram in Fig. 3). enable logic, the latch in the cell is able to suppress any glitching activity generated there, as it is closed during the first half of a (a) Short before the rising edge of Gemaster both clock cycle. The master clock signal is delayed by a simple two- latches are transparent and glitches of the data sig- input gate, exactly as is the case with the clock signal of slaves.

View Full Text

Details

  • File Type
    pdf
  • Upload Time
    -
  • Content Languages
    English
  • Upload User
    Anonymous/Not logged-in
  • File Pages
    4 Page
  • File Size
    -

Download

Channel Download Status
Express Download Enable

Copyright

We respect the copyrights and intellectual property rights of all users. All uploaded documents are either original works of the uploader or authorized works of the rightful owners.

  • Not to be reproduced or distributed without explicit permission.
  • Not used for commercial purposes outside of approved use cases.
  • Not used to infringe on the rights of the original creators.
  • If you believe any content infringes your copyright, please contact us immediately.

Support

For help with questions, suggestions, or problems, please contact us