Energy-Efficient Near-Threshold Standard Cell Library for IoT Applications

AbdelRahman Hesham1, Amin Nassar1, and Hassan Mostafa1,2

1Electronics and Communications Engineering Department, Cairo University, Egypt.
2University of Science and Technology, Nanotechnology and Nanoelectronics Program, Zewail City of Science and Technology, October Gardens, 6th of October, Giza 12578, Egypt.

Abstract: In this paper, a low-energy, minimum-area CMOS standard cell library suitable for IoT applications is proposed. Energy consumption is reduced by operating the library in the Near-Threshold Voltage (NTV) region and by designing the cell layouts at the minimum possible area for the technology process used. A body biasing technique is proposed to boost pMOS performance. The operating voltage and transistor sizing are also selected to achieve the minimum energy consumption while operating in the frequency range of 1 MHz to 20 MHz, which is suitable for IoT applications. The proposed library was designed and characterized in the UMC 130 nm CMOS technology process, and was modeled for use in synthesis tools. To demonstrate the benefit for IoT applications, the library was benchmarked by implementing three cryptographic algorithms: ASCON, AEGIS-128, and AEZ. Synthesis results show that the three cores can operate at 18 MHz, 14 MHz, and 16 MHz respectively, while consuming 0.466 pJ, 3.006 pJ, and 5.064 pJ.

Index Terms: CMOS digital integrated circuits, design methodology, near-threshold CMOS circuits, ultra-low power design, ultra-low energy, IoT, ASCON, AEGIS, AEZ

I. INTRODUCTION

The Internet of Things (IoT) is a novel paradigm that is rapidly gaining ground in modern wireless communication. It is enabling a wide range of applications touching every aspect of our lives, such as wearable devices, connected cars, and smart homes. The spread of IoT depends on the reliability of the devices accessing it. Device reliability is based on: (1) computation power, (2) efficiency of energy consumption, and (3) security against possible threats.

The computation power of IoT devices has been widely explored in the literature [1]–[5], with a major interest in energy reduction techniques. These techniques have been explored and proposed at all design levels, namely the circuit, logic, RTL, algorithm, and system levels. For IoT devices, most of the energy saving is achieved by circuit-level solutions. One of the fastest-growing topics in this area is the feasibility of voltage scaling to reduce energy consumption. Circuit design in the subthreshold region has gained increasing interest as a way to meet low-power requirements by operating the circuits at the Minimum Energy Point (MEP) [1]. The drawback of this MEP paradigm is the significant loss in performance [2]. To recover some of the performance loss while maintaining the power gains, Near-Threshold Voltage (NTV) operation was introduced [2]. It was shown that in NTV, energy savings can be on the order of 10X, with only a 10X degradation in performance, providing a much better energy/performance trade-off than subthreshold operation [3].

In addition to the computation power and energy concerns in IoT applications, the possible threats deriving from widespread adoption of such a technology are stressed. As predicted by Cisco, there will be 50 billion IoT connected devices by 2020 [6]. Integrating such a tremendous number of devices into IoT potentially brings in a new concern: system security. Thus, cryptography became one of the main approaches to securing user data against attacks. Significant research effort has gone into providing algorithmic-level cryptography solutions and assessing hardware implementations of these algorithms. In [7], an ASIC implementation of three commonly used cryptography algorithms is carried out to check their suitability for IoT applications. Similarly, architectural solutions are proposed in [8] and used to implement a co-processor suitable for Narrow-Band (NB) IoT devices. Also in this direction, in 2012 the European Network of Excellence in Cryptology (ECRYPT) called for a new competition for authenticated ciphers: CAESAR (Competition for Authenticated Encryption: Security, Applicability, and Robustness) [9]. Reviews of hardware implementations of CAESAR algorithms are presented in [10], [11]. The winning algorithms were announced in 2019, and they cover three main areas of application: lightweight, high performance, and defense-in-depth. In this paper, three CAESAR algorithms are explored for potential benefit in IoT devices.

In this paper, a low-energy, minimum-area standard cell library is proposed in the UMC 130 nm process. The library operates in the NTV region at a 350 mV supply. This supply was selected to achieve the minimum Power Delay Product (PDP) for the process node used. The minimum area is achieved by minimizing the cell height and by utilizing Euler's path in the design of each cell layout. A calculation method is proposed to obtain the minimum cell height as a function of technology parameters. The Inverse Narrow Width Effect (INWE) was checked and utilized for library PPA gains. The proposed library shows better energy and area measurements than the other libraries surveyed. To show the gains from NTV operation, the library is benchmarked against a commercial library operating in the Super-Threshold Voltage region by implementing three of the CAESAR finalists: ASCON [12], AEGIS-128 [13], and AEZ [14]. The rest of the paper is organized as follows.
Section II describes the library architecture design and discusses the proposed solutions for improving library PPA; a comparison with similar libraries is then presented. Section III describes the flow used for designing the set of cells that make up the library. Section IV reports the benchmark results of the proposed library using cryptography IPs. Lastly, the paper is concluded in Section V.

II. LIBRARY ARCHITECTURE DESIGN

In digital design using the standard-cell approach, all modules must have the same height [15], [16]. The placement of standard cells has to be aligned with pre-specified standard-cell rows in the placement region. Because of its popularity, most placement algorithms assume a standard-cell design style. The minimum cell dimensions, in other words the minimum resolution of cell placement in the 2D area, are defined as the Unit Tile. All cells in a standard-cell library must have a height equal to the unit tile height or a multiple of it, and a width that is a multiple of the unit tile width.

A. Minimum Cell Height Design

Here, we propose a method to define the minimum cell height for a given process node. All the drawings inside a cell must pass the Design Rule Checks (DRC). Given that the shapes drawn inside any cell are from Front-End Of Line (FEOL) layers or Back-End Of Line (BEOL) layers, rules related to both layer groups must be considered.

The process parameters needed for the FEOL checks are: minimum diffusion enclosure inside NPLUS ($EN_n$), minimum diffusion enclosure inside PPLUS ($EN_p$), minimum diffusion width inside NPLUS ($W_n$), and minimum diffusion width inside PPLUS ($W_p$). From Fig. 1, the minimum cell height ($H_{\min,\mathrm{FEOL}}$) is calculated as:

$$H_{\min,\mathrm{FEOL}} = 2EN_n + 2EN_p + W_n + W_p \qquad (1)$$

Fig. 1. Technology FEOL parameters to calculate minimum tracks.

Then, to calculate $H_{\min}$ from the BEOL rules, we need to construct the layout given in Fig. 2. The limitation here comes from the first metal layer (metal 1): (1) any metal 1 shape must keep a spacing distance of $S_{m1}$ to other metal 1 shapes, (2) any via connecting metal 1 to a poly shape or diffusion shape (CONT) must be enclosed by a metal 1 shape, and (3) PG rails need to be drawn on metal 1 to provide the supply voltage to the transistors. The minimum PG rail width is the minimum metal 1 width ($W_{m1}$). So:

$$H_{\min,\mathrm{BEOL}} = 4S_{m1} + 3CE_{m1} + W_{m1} \qquad (2)$$

Fig. 2. Technology BEOL parameters to calculate minimum tracks.

To avoid any routing problem, either internally when connecting transistors or externally when the cell is connected to other cells, the cell height must be an integer multiple of the pitch of the first horizontal metal layer ($P_{m2}$), which is metal 2 in our proposed library:

$$H_{\min} = nP_{m2} \qquad (3)$$

From Equations (1), (2), and (3), we can define:

$$n = \max(H_{\min,\mathrm{FEOL}},\ H_{\min,\mathrm{BEOL}}) / P_{m2} \qquad (4)$$

where n is the minimum number of cell architecture tracks for a given technology. This equation is technology-independent, and can be applied to planar CMOS process nodes.

For the UMC 130 nm process, and based on the process Design Rule Manual (DRM), the minimum cell height is calculated to be 5 tracks (5T). This is the minimum number of tracks reported in the literature. In [17], a 6T architecture is developed; in [18]–[21], 8T–12T architectures are developed by utilizing a multi-finger structure to achieve more area efficiency.

B. Cell Layout Design

Implementation of the 5T architecture is enabled by using Euler's path theory for layout design and by using three metal layers for intra-cell routing.

Euler's path theory [22]–[24] is used in [25] to provide minimum-area transistor placement inside a given cell area. The resulting transistor placement can be routed in multiple ways. The routing selected for each cell in our library considers: (1) pin accessibility when used in placement and routing tools, (2) minimizing pin capacitance, and (3) meeting the technology DRC rules. Three metal layers are used for intra-cell routing. Metal 1 and metal 2 are used extensively, while metal 3 is used only rarely, when all metal 1 and metal 2 resources are already used.

C. Transistor Sizing

Transistor sizing is one of the main factors determining the library's power and speed performance. The 5T architecture used sets a limit on the maximum transistor sizing:

$$W_n + W_p \le 5P_{m2} - 2EN_p - 2EN_n \qquad (5)$$

For the process used, this limit is calculated to be 1.04 µm.

Fig. 3. NMOS threshold voltage vs transistor width at different supply voltages in 130 nm.
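The height and sizing calculations in Equations (1) to (5) can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: every design-rule value below is hypothetical (the actual UMC 130 nm DRM values are not given in the text, so the result is not expected to reproduce the 1.04 µm limit), the `DesignRules` class and function names are invented for this example, and rounding n up to the next integer track is an assumption, since Equation (4) is written as a plain ratio. Lengths are kept as integers in nanometres so the rounding is exact.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DesignRules:
    """Design-rule values in nanometres (all hypothetical, for illustration)."""
    en_n: int   # min diffusion enclosure inside NPLUS (ENn)
    en_p: int   # min diffusion enclosure inside PPLUS (ENp)
    w_n: int    # min diffusion width inside NPLUS (Wn)
    w_p: int    # min diffusion width inside PPLUS (Wp)
    s_m1: int   # min metal 1 spacing (Sm1)
    ce_m1: int  # CEm1 term of Eq. (2); not defined in the text
    w_m1: int   # min metal 1 width (Wm1)
    p_m2: int   # metal 2 pitch (Pm2)


def h_min_feol(r: DesignRules) -> int:
    """Eq. (1): minimum cell height imposed by FEOL rules."""
    return 2 * r.en_n + 2 * r.en_p + r.w_n + r.w_p


def h_min_beol(r: DesignRules) -> int:
    """Eq. (2): minimum cell height imposed by BEOL (metal 1) rules."""
    return 4 * r.s_m1 + 3 * r.ce_m1 + r.w_m1


def min_tracks(r: DesignRules) -> int:
    """Eq. (4), rounded up so the height is a whole number of tracks (assumption)."""
    h = max(h_min_feol(r), h_min_beol(r))
    return -(-h // r.p_m2)  # integer ceiling division


def max_total_width(r: DesignRules, tracks: int) -> int:
    """Eq. (5): upper bound on Wn + Wp for a cell that is `tracks` tall."""
    return tracks * r.p_m2 - 2 * r.en_p - 2 * r.en_n


# Hypothetical rule set, NOT taken from any real design rule manual.
rules = DesignRules(en_n=200, en_p=200, w_n=300, w_p=300,
                    s_m1=200, ce_m1=250, w_m1=200, p_m2=400)

n = min_tracks(rules)
print(f"FEOL height: {h_min_feol(rules)} nm, BEOL height: {h_min_beol(rules)} nm")
print(f"minimum tracks: {n}, cell height: {n * rules.p_m2} nm")
print(f"max Wn + Wp: {max_total_width(rules, n)} nm")
```

With these made-up numbers the BEOL constraint dominates, which shows how Equation (4) picks the tighter of the two layer groups before the height is snapped to the metal 2 pitch.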