Energy-Efficient Near-Threshold Standard Cell Library for Iot

Energy-Efficient Near-Threshold Standard Cell Library for IoT Applications AbdelRahman Hesham1, Amin Nassar1, and Hassan Mostafa1,2 1Electronics and Communications Engineering Department, Cairo University, Egypt. 2University of Science and technology, Nanotechnology and Nanoelectronics Program, Zewail City of Science and Technology, October Gardens, 6th of October, Giza 12578, Egypt. [email protected], [email protected], [email protected] Abstract—In this paper, a low-energy minimum-area CMOS a much better energy/performance trade-off than subthreshold standard cell library suitable for IoT applications is proposed. operation [3]. Energy consumption reduction is achieved by operating the In addition to computation power and energy concerns in library in Near-Threshold Voltage (NTV) region, and by designing layout of cells at the minimum possible area for the used IoT applications, the possible threats deriving from widespread technology process. Body biasing technique is proposed to boost adoption of such a technology are stressed. As predicted by pMOS performance. Operating voltage and transistor sizing are Cisco, there will be 50 billion IoT connected devices by 2020 also selected to achieve the minimum energy consumption while [6]. Integration of such a tremendous number of devices into operating at the frequency range of 1MHz to 20MHz which is IoT potentially brings in a new concern, System Security. suitable for IoT applications. The proposed library was designed and characterized in UMC 130 nm CMOS technology process. Thus, cryptography became one of the main approaches to The library was modeled to be used in synthesis tools. To prove secure and overcome attacks on user data. Significant research the benefit for IoT applications, the library was benchmarked by effort was done to provide algorithmic level cryptography implementing 3 cryptographic algorithms: ASCON, AEGIS-128, solutions and to assess hardware implementation of these and AEZ. Synthesis results are showing that the three cores can algorithms. In [7], ASIC implementation of 3 commonly used operate at 18 MHz, 14 MHz, and 16 MHz respectively, while consuming 0.466 pJ, 3.006 pJ, and 5.064 pJ. cryptography algorithms is conducted to check their suitability Index Terms—CMOS digital integrated circuits, design for IoT applications. Similarly, architectural solutions are methodology, near-threshold CMOS circuits, ultra-low power proposed in [8] and used to implement co-processor suitable design, ultra-low energy, IoT, ASCON, AEGIS, AEZ for Narrow-Band (NB) IoT devices. Also, in this direction and in 2012, the European Network of Excellence in Cryptology I. INTRODUCTION (ECRYPT) called for a new competition for authenticated The Internet of Things (IoT) is a novel paradigm that is ciphers: CAESAR (Competition for Authenticated Encryption: rapidly gaining ground in the scenario of modern wireless Security, Applicability, and Robustness) [9]. Review of hard- communication. This is resulting in a wide range of applica- ware implementation of CAESAR algorithms are presented tions touching every aspect of our life, like wearable devices, in [10], [11]. The winning algorithms were announced in connected cars, and smart homes. Enablement of IoT spread 2019, and they were covering 3 main areas of application: is depending on the reliability of devices accessing it. Device lightweight, high performance, and defense-in-depth. In this reliability is based on: (1) computation power, (2) efficiency of paper, 3 of these applications were explored for potential energy consumption, and (3) security against possible threats. benefit in IoT devices. Computation power of IoT devices has been widely ex- In this paper, a low-energy minimum-area standard cell plored in the literature [1]–[5], with a major interest in energy library is proposed in UMC 130 nm process. Library is reduction techniques. These techniques have been explored operating in NTV region at 350 mV supply. This supply was and proposed at all design levels; namely circuit, logic, RTL, selected to achieve the minimum Power Delay Product (PDP) algorithm, and system levels. For IoT devices, most of energy for the used process node. The minimum area is achieved saving is achieved by circuit-level solutions. One of the most by minimizing cell height and by utilizing Euler’s Path in growing topics in this area is the feasibility of voltage scaling design of each cell layout. A calculation method is proposed to reduce energy consumption. Circuit design in subthreshold to get minimum cell height as a function of technology region has gained increasing interest to achieve low power parameters. The Inverse Narrow Width Effect (INWE) was requirements by operating the circuits at the Minimum Energy checked and utilized for library PPA gains. The proposed Point (MEP) [1]. The drawback of this MEP paradigm is the library is showing better energy and area measurements com- significant loss in performance [2]. To recover some of the pared to other libraries surveyed. To show the gains from performance loss while maintaining the power gains, Near- utilizing NTV operation, the library is benchmarked against Threshold Voltage (NTV) operation was introduced [2]. It was a commercial library operating in Super-Threshold Voltage shown that in NTV, energy savings can be in the order of region to implement three of CAESAR finalists: ASCON [12], 10X, with only a 10X degradation in performance, providing AEGIS-128 [13], and AEZ [14]. The rest of the paper is organized as follows. Section II describes the library architecture design and discusses the pro- n = max(Hmin;F EOL;Hmin;BEOL)=Pm2 (4) posed solutions for improving library PPA. Then,comparison result with similar library is presented. Section III describes where n is the minimum cell architecture tracks for a given the flow used for designing the set of cells constructing technology. This equation is technology-independent, and can the library. Section IV reports the benchmark results of the be applied to planar CMOS process nodes. proposed library using cryptography IPs. Lastly, the paper is concluded in Section V. II. LIBRARY ARCHITECTURE DESIGN In digital design using standard-cell approach, all modules must have the same height [15], [16]. The placement of standard cells has to be aligned with some pre-specified standard-cell rows in the placement region. And because of its popularity, most placement algorithms assume a standard- cell design style. The minimum cell dimensions or in other words the minimum resolution of cell placement in 2D area is defined as Unit Tile. All the cells in standard-cell library have to be of –or multiples of– unit tile height and to have a width that is a multiple of unit tile width. A. Minimum Cell Height Design Here, we are proposing a method to define the minimum cell height for a given process node. All the drawings inside cell must meet the Design Rule Checks (DRC). And given that the shapes drawn inside any cell are from Front-End Of Line (FEOL) layers or Back-End Of Line (BEOL) layers, rules related to both layer groups must be considered. The process parameters needed for FEOL checks are: minimum diffusion enclosure inside NPLUS (ENn), minimum diffusion enclosure inside PPLUS (ENp), minimum diffusion width inside NPLUS (Wn), and minimum diffusion width inside PPLUS (Wp). From Fig. 1, the minimum cell height Fig. 1. Technology FEOL parameters to calculate minimum tracks. (Hmin,FEOL) is calculated as: Hmin;F EOL = 2ENp + 2ENp + Wn + Wp (1) For UMC 130nm process, and based on the process Design Rule Manual (DRM), the minimum cell height tracks is calcu- Then, to calculate H from BEOL rules, we need to construct min lated to be 5T. This is the minimum number of tracks reported the layout given in Fig. 2. The limitation here is coming from in the literature. In [17], a 6T architecture is developed, in the first metal layer (metal 1). (1) Any metal 1 shape must [18]–[21] 8T–12T architectures are developed by utilizing keep a spacing distance of S to other metal 1 shapes, (2) m1 multi-finger structure to achieve more area efficiency. any via connecting metal 1 to poly shape or diffusion shape (CONT) must be enclosed by metal 1 shape, and finally (3) PG B. Cell Layout Design rails needs to be drawn on metal 1 to provide supply voltage to transistors. The minimum PG rail width is the minimum Implementation of 5T architecture is enabled by: using metal 1 width (Wm1). So: Euler’s path theory for layout design, and using 3 metal layers for intra-cell routing. Hmin;BEOL = 4Sm1 + 3CEm1 + Wm1 (2) Euler’s path theory [22]–[24] is used in [25] to provide In order to avoid any routing problem, either internally to the minimum-area transistor placement inside given cell area. connect transistors, or externally when the cell gets connected The resulting transistor placement can be routed in multiple to other cells, cell height must be integer multiples of the pitch ways. The selected routing of each cell used in our library of the first horizontal metal layer -metal 2 in our proposed considers: (1) pin accessibility when used in Placement and Routing tools, (2) minimizing pin capacitance, and (3) meeting library- (Pm2). technology DRC rules. Three metal layers are used for intra- cell routing. Metal 1 and metal 2 are used extensively, while Hmin = nPm2 (3) metal 3 is rarely used when all metal 1 and metal 2 resources And from Equations (1), (2), (3), we can define: are used. Fig. 3. NMOS threshold voltage vs transistor width at different supply Fig. 2. Technology BEOL parameters to calculate minimum tracks. voltages in 130 nm. C. Transistor Sizing Transistor sizing is one of the main factors deciding the library power and speed performance. The used 5T architecture sets a limit on the maximum transistor sizing: Wn + Wp ≤ 5Pm2 − 2ENp − 2ENn (5) For the used process, this limit is calculated to be 1.04 µM.

Energy-Efficient Near-Threshold Standard Cell Library for Iot

Outline ECE473 Computer Architecture and Organization • Technology Trends • Introduction to Computer Technology Trends Architecture

The Advantages of FRAM-Based Smart Ics for Next Generation Government Electronic Ids

Nano-Cmos Scaling Problems and Implications

3D Massively Parallel Processor with Stacked Memory)

WP Microelectronics and Interconnections

Madison Processor

Low-Noise Design Issues for Analog Front-End Electronics in 130 Nm and 90 Nm CMOS Technologies

45 Nm Process

A Circuit and Architecture Codesign Approach for a Hybrid CMOS

3D Integration for Novel Pixel Detectors: Current Status and Future Promise

Chapter 1: Fundamentals of Quantitative Design and Analysis

REPORT Moore's Law & the Value Transistor