Energy Efficient And High Performance 64-bit Arithmetic Logic Unit Using 28nm Technology

Shruti Murgai*, Ashutosh Gupta**, Gayathri Muthukrishnan***
M.Tech Scholar*, ***, Asst. Professor**, Department of ECE, ASET, AMITY University, Noida, INDIA

Abstract—Arithmetic Logic Units are one of the vital units in general purpose processors and a major source of power dissipation. In this paper we demonstrate an optimized Arithmetic and Logic Unit built around an optimized carry select adder. Carry select adders are regarded as among the best in their category in terms of power and delay. In this context, a full adder optimized for power has been used in synthesizing a carry select adder. Combined with the new adder structure, there is a substantial improvement in terms of power and delay. The total device power and hierarchy power have been reduced by 12.5 % and 53.39 % respectively. A 3 % reduction in total completion time has also been observed. The circuit has been synthesized on a Kintex FPGA through Xilinx ISE 14.3 using 28 nm technology in Verilog HDL, and the results have been simulated on Modelsim 10.3c. The design is verified using System Verilog on QuestaSim in a UVM environment.

Keywords— ALU, Carry Select Adder, Power, FPGA.

I. INTRODUCTION

In the era of a growing System-on-Chip industry and the scaling of devices down to the nanometre regime, the production of any VLSI chip requires attention to the power and area requirements and the propagation delay of the design. Densely packed transistors on a single chip have led to an increase in power dissipation. Power is the key concern because of the noteworthy growth in personal computing devices and wireless communication systems, which demand complex functionality and high-speed computation with low power consumption. The need for lower-power devices continues to increase drastically as components become smaller, battery-powered and richer in functionality. The benefit of combining low-power components with low-power design techniques is more important now than ever before.

Addition is the most fundamental arithmetic operation among all others. Multipliers and similar units all have adders as their basic functional unit.

The main contribution of the paper is that we have designed a 64-bit Arithmetic and Logic Unit for the computation of eight functions. This paper makes use of an optimized carry select adder block. Optimization has been carried out by reducing the internal logic in the circuit through the use of a multiplexer-based full adder circuit.

The rest of the paper is organized as follows. In Section II, previous work in the field of arithmetic logic units and carry select adders is discussed. Section III discusses the design of our 64-bit ALU. Section IV presents various results along with the simulation window. Section V concludes the paper.

II. PREVIOUS WORK

A multiplexer-based full adder cell was proposed by Alhalabi, B. and Al-Sheraidah [1] in 2001 that uses 23 % less power and is 64 % faster. The use of multiplexers not only lowers transition activity and provides charge-recycling capability, but also lets every signal gate be driven directly by fresh input signals, leading to a noticeable reduction in short-circuit power consumption. A 4-bit Arithmetic and Logic Unit that performed eight functions was proposed in 2011. An optimized full adder circuit was used for performing addition and subtraction operations. More than 70 % reduction in power and area was observed as compared to the conventional design [2]. Carry select adders are known for their speed. A low-power carry select adder can be an asset for any SOI [3], [4], [6]. An efficient full adder design can be used to optimize larger designs. In 2013, a 4-bit Arithmetic and Logic Unit based on the gate diffusion technique was proposed, performing the same eight operations [5]. We have designed a 64-bit Arithmetic and Logic Unit performing eight operations, based on a multiplexer-based optimized full adder circuit. We are also in the process of implementing this design on a reconfigurable system, as implemented in [7-11].

III. ARITHMETIC AND LOGIC UNIT

The arithmetic logic unit is the basic building block of any central processing unit and is found in every processor nowadays.

In this paper the proposed Arithmetic and Logic Unit performs eight operations, namely addition, subtraction, increment, decrement, XOR, AND, EX-NOR and OR, depending upon the select line. The RTL schematic diagram is shown in Fig. 1. Table I gives the truth table for the operations performed by the Arithmetic and Logic Unit based on the status of the select signal.
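The select-line decoding described in this section can be illustrated with a small behavioural model. The sketch below is written in Python rather than the Verilog HDL used for the actual design, and it only models the function of the ALU, not its gate-level structure. The name `alu64`, the dropping of the carry-out, and the use of two's complement (complement plus one) for subtraction are assumptions made for illustration; the select encoding follows Table I.

```python
MASK64 = (1 << 64) - 1  # all results are truncated to 64 bits (assumption: carry-out is dropped)


def alu64(sel: int, a: int, b: int) -> int:
    """Behavioural model of the eight ALU operations, keyed by the 3-bit select value."""
    a &= MASK64
    b &= MASK64
    if sel == 0b000:      # AND
        result = a & b
    elif sel == 0b001:    # EX-NOR
        result = ~(a ^ b)
    elif sel == 0b010:    # EX-OR
        result = a ^ b
    elif sel == 0b011:    # OR
        result = a | b
    elif sel == 0b100:    # ADDITION
        result = a + b
    elif sel == 0b101:    # SUBTRACTION: add the complement of b plus 1 (two's complement, assumed)
        result = a + (~b & MASK64) + 1
    elif sel == 0b110:    # INCREMENT the first input
        result = a + 1
    elif sel == 0b111:    # DECREMENT the first input
        result = a - 1
    else:
        raise ValueError("sel must be a 3-bit value")
    return result & MASK64


if __name__ == "__main__":
    print(hex(alu64(0b100, 5, 7)))  # addition
    print(hex(alu64(0b101, 5, 7)))  # subtraction wraps modulo 2**64
```

In this model the four arithmetic operations all reduce to an addition, which is consistent with a single adder being shared between them.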

978-1-4799-8792-4/15/$31.00 © 2015 IEEE

TABLE I. OPERATIONS OF ARITHMETIC & LOGIC UNIT [2], [5]

SELECTION LINES             OPERATION
SEL[2]   SEL[1]   SEL[0]
0        0        0         AND
0        0        1         EXNOR
0        1        0         EXOR
0        1        1         OR
1        0        0         ADDITION
1        0        1         SUBTRACTION
1        1        0         INCREMENT
1        1        1         DECREMENT

Fig.1. RTL Schematic Diagram of 64-Bit Arithmetic And Logic Unit

For SEL = 10X (where X denotes either 0 or 1) the 64-bit optimized carry select adder is used. SEL = 100 performs addition. For SEL = 101 the complement of the second input is added to the first input using the same carry select adder design. SEL = 110 adds 1 to the first input and SEL = 111 subtracts 1 from the first input.

Fig.2. Optimized Full Adder Circuit

A. CARRY SELECT ADDER

The carry select adder used in this design is built from an optimized full adder circuit that uses multiplexers. Fig. 2 shows the full adder used in this design. Implementing the full adder with multiplexers reduces the logic used in the device. This methodology has given us better results than previously implemented arithmetic and logic units. The sum and carry outputs generated by this full adder circuit, for inputs a, b and carry-in c, are:

y = a ^ b
sum = y ^ c
carry = y ? c : a

The total power of any device is the sum of static and dynamic power. Dynamic (switching) power is mainly due to the charging and discharging of load capacitors driven by the circuit. Frequent toggling of internal nodes increases the dynamic power. The static power of the proposed arithmetic logic unit is almost the same as that of the devices available in the market. This design has reduced the dynamic power of the device by reducing the logic and signals required by it. With the use of multiplexers in the circuit, we are able to reduce the switching activity at the internal nodes and so reduce the power of the circuit. The propagation delay has also been reduced.

Fig.3. 4-Bit Ripple Carry Adder

Fig. 3 shows how these full adders are further cascaded to form a 4-bit ripple carry adder. Ripple carry adders are adders in which the carry out of each stage is passed to the next stage. Multiple full adder modules can be cascaded one after another to add wider operands.

These 4-bit adders are cascaded together in a 32-bit carry select adder. A carry select adder normally comprises two ripple carry adders and a multiplexer. Addition of two n-bit numbers with a carry select adder is done with two ripple carry adders so that the calculation is performed twice: once assuming the carry-in is zero and once assuming it is one. Fig. 4 shows the organization of the 64-bit carry select adder. The adder is sped up because both results are calculated in advance, and the true sum and the true carry are then chosen by the multiplexer once the correct carry is known. There are various kinds of carry select adders used in the proposed design. Carry select adders offer a good compromise between area, speed, power and delay. The device generated through the use of these carry select adders has proven to be efficient in terms of power and delay.
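The adder hierarchy described in this section (multiplexer-based full adder, 4-bit ripple carry adder, carry select adder) can be sketched as a bit-level behavioural model. The Python below is an illustration only, not the Verilog HDL of the proposed design. The function names, the uniform 4-bit block size and the requirement that the width be a multiple of the block size are assumptions.

```python
from typing import Tuple


def full_adder(a: int, b: int, c: int) -> Tuple[int, int]:
    """Multiplexer-based full adder: y selects between c and a for the carry."""
    y = a ^ b
    s = y ^ c
    carry = c if y else a  # 2:1 multiplexer controlled by y
    return s, carry


def ripple_carry_add(a: int, b: int, cin: int, width: int = 4) -> Tuple[int, int]:
    """Cascade `width` full adders, passing each carry to the next stage."""
    total = 0
    carry = cin
    for i in range(width):
        s, carry = full_adder((a >> i) & 1, (b >> i) & 1, carry)
        total |= s << i
    return total, carry


def carry_select_add(a: int, b: int, cin: int = 0,
                     width: int = 64, block: int = 4) -> Tuple[int, int]:
    """Carry select addition; `width` is assumed to be a multiple of `block`."""
    mask = (1 << block) - 1
    total = 0
    carry = cin
    for lo in range(0, width, block):
        a_blk = (a >> lo) & mask
        b_blk = (b >> lo) & mask
        # Each block is computed twice: once for carry-in 0 and once for carry-in 1.
        sum0, c0 = ripple_carry_add(a_blk, b_blk, 0, block)
        sum1, c1 = ripple_carry_add(a_blk, b_blk, 1, block)
        # The multiplexer picks the true sum and carry once the incoming carry is known.
        blk_sum, carry = (sum1, c1) if carry else (sum0, c0)
        total |= blk_sum << lo
    return total, carry


if __name__ == "__main__":
    print(carry_select_add(5, 7))
    print(carry_select_add((1 << 64) - 1, 1))
```

In hardware the two ripple carry additions of a block run in parallel, which is where the speed advantage comes from; the sequential loop here only reproduces the result, not the timing.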

454 2015 International Conference on Advances in Computing, Communications and Informatics (ICACCI)

Fig.5. Graph comparing various power consumption values of conventional and proposed arithmetic and logic unit

Fig.4. 64-Bit Carry Select Adder

IV. SIMULATION RESULTS AND DISCUSSIONS

This section describes the performance of the proposed ALU using the Xilinx ISE tool on 28 nm technology. The results of the proposed Arithmetic and Logic Unit are compared with the results of the conventional design. Both units had to be implemented in the same technology node, since two designs cannot be compared fairly unless their technology node is the same. Both ALUs are therefore simulated in Modelsim simulator 10.3c, using the KINTEX device family on 28 nm technology. The design of the circuit has been synthesized using Verilog HDL. The power utilized by the device was calculated with the help of the XPower Analyzer tool of XILINX ISE DESIGN SUITE 14.3. Figs. 5 and 6 show graphs comparing the various power values (clock, logic, signal, IOs, static power, dynamic power and total power consumption) and the hierarchy power of the conventional and proposed arithmetic and logic units. Fig. 7 shows the graph comparing the logic used in the proposed and conventional designs. The reduction in logic leads to a reduction in the power dissipation of the device. The on-chip time and power summaries are listed in Tables II and III. Fig. 8 shows the simulation waveform generated through Modelsim for the 64-bit Arithmetic and Logic Unit. The total device power and hierarchy power have been reduced by 12.5 % and 53.39 % respectively. A 3 % reduction in total completion time has also been observed from the synthesis report.

Fig.6. Graph comparing the hierarchy power of the designs

Fig.7. Graph comparing number of logics used in the design

TABLE II. ON CHIP TIME SUMMARY

ON CHIP                              PROPOSED (sec)   CONVENTIONAL (sec)
Total real time to xst completion    65.00            67.00
Total cpu time to xst completion     65.57            67.57
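The relative reduction in completion time can be checked directly from the Table II values. The short Python sketch below is only a worked calculation; the helper name `percent_reduction` and the choice of the conventional design as the baseline are assumptions.

```python
def percent_reduction(conventional: float, proposed: float) -> float:
    """Reduction of the proposed value relative to the conventional one, in percent."""
    return (conventional - proposed) / conventional * 100.0


# Values taken from Table II (seconds).
real_time = percent_reduction(67.00, 65.00)
cpu_time = percent_reduction(67.57, 65.57)

if __name__ == "__main__":
    print(f"real time: {real_time:.2f} %, cpu time: {cpu_time:.2f} %")
```

Both figures come out close to the 3 % reduction in completion time quoted in the text.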


Fig.8. Simulation window of 64-bit Arithmetic and Logic Unit

TABLE III. ON CHIP POWER AND TIME SUMMARY

                 PROPOSED               CONVENTIONAL
ON CHIP          POWER (mW)   USED      POWER (mW)   USED
CLOCK            0.16         -         1.59         -
LOGIC            0.76         496       3.53         504
SIGNAL           6.63         752       10.83        759
IOS              35.98        197       39.10        197
STATIC POWER     45.32        -         45.35        -
DYNAMIC POWER    43.53        -         55.05        -
TOTAL            88.85        -         100.40       -

V. CONCLUSION

A 64-bit Arithmetic Logic Unit has been successfully designed, simulated and optimized. This design has been synthesized using Verilog HDL. Further, the arithmetic logic unit has been verified successfully using System Verilog on QUESTASIM. The simulation has been done to ensure correct working of the design. The test bench has been written in a System Verilog UVM environment to ensure the design is free from bugs. The power of the proposed design can be reduced further when implemented with the Cadence design tools at transistor level.

REFERENCES

[1] Alhalabi, B.; Al-Sheraidah, A., "A novel low power multiplexer-based full adder cell," Electronics, Circuits and Systems, 2001. ICECS 2001. The 8th IEEE International Conference on, vol.3, pp.1433-1436, 2001.
[2] Rani, T.E.; Rani, M.A.; Rao, R., "AREA optimized low power arithmetic and logic unit," Electronics Computer Technology (ICECT), 2011 3rd International Conference on, vol.3, pp.224-228, 8-10 April 2011.
[3] Parmar, S.; Singh, K.P., "Design of high speed hybrid carry select adder," Advance Computing Conference (IACC), 2013 IEEE 3rd International, pp.1656-1663, 22-23 Feb. 2013.
[4] Ramakrishna Reddy, A.; Parvathi, M., "Efficient carry select adder using 0.12μm technology for low power applications," Advances in Computing, Communications and Informatics (ICACCI), 2013 International Conference on, pp.550-553, 22-25 Aug. 2013.
[5] Dubey, V.; Sairam, R., "An Arithmetic and Logic Unit Optimized for Area and Power," Advanced Computing & Communication Technologies (ACCT), 2014 Fourth International Conference on, pp.330-334, 8-9 Feb. 2014.
[6] Yousuf, R.; Najeeb-ud-din, "Synthesis of carry select adder in 65 nm FPGA," TENCON 2008 - 2008 IEEE Region 10 Conference, pp.1-6, 19-21 Nov. 2008. doi: 10.1109/TENCON.2008.4766397
[7] Ashutosh Gupta and Kota Solomon Raju, "Design and Implementation of 32-bit Controller for Interactive Interfacing with Reconfigurable Computing Systems," International Journal of Computer Science and Information Technology (IJCSIT), Vol. 1, No. 2, pp. 80-87, Nov 2009. ISSN: 0975-3826 (online); 0975-4660.
[8] Gupta. Ashutosh, Duhan. Manoj and Raju Kota. Solomon, "HDL Implementation of Sine-Cosine Function Using CORDIC Algorithm in 32-Bit Floating Point Format" (June 19, 2009). The Icfai University Journal of Science & Technology, Vol. 5, No. 2, pp. 40-48, June 2009.
[9] Parul Sharma and Ashutosh Gupta, "Design, Implementation and Optimization of highly efficient UART," The IUP Journal of Science & Technology, Vol. 5, No. 4, pp. 21-30, December 2009.
[10] Gaur, Nidhi; Gupta, Ashutosh; Sharma, Anil Kumar; Malviya, Rahul, "HDL implementation of prepaid electricity billing machine on FPGA," Confluence The Next Generation Information Technology Summit (Confluence), 2014 5th International Conference, pp.972-975, 25-26 Sept. 2014.
[11] Agarwal, C.; Gupta, A, "Modeling, simulation based DC motor speed control by implementing PID controller on FPGA," Confluence 2013: The Next Generation Information Technology Summit (4th International Conference), pp.467-471, 26-27 Sept. 2013.
