Aadam: a Fast, Accurate, and Versatile Aging-Aware Cell Library Delay Model Using Feed-Forward Neural Network

Total Page:16

File Type:pdf, Size:1020Kb

Aadam: a Fast, Accurate, and Versatile Aging-Aware Cell Library Delay Model Using Feed-Forward Neural Network Aadam: A Fast, Accurate, and Versatile Aging-Aware Cell Library Delay Model using Feed-Forward Neural Network Seyed Milad Ebrahimipour1, Behnam Ghavami2, Hamid Mousavi1, Mohsen Raji3 Zhenman Fang2, Lesley Shannon2 1Shahid Bahonar University of Kerman, {miladebrahimi, hamidmousavi0}@eng.uk.ac.ir 2Simon Fraser University, {behnam_ghavami, zhenman, lesley_shannon}@sfu.ca 3Shiraz University, raji@shirazu.ac.ir ABSTRACT 1 INTRODUCTION With the CMOS technology scaling, transistor aging has become With the CMOS technology scaling, the reliability of circuits has one major issue affecting circuit reliability and lifetime. There are become one of the major issues affecting digital circuit designs1 [ , 7]. two major classes of existing studies that model the aging effects In addition to the correct functionality of the circuit, a longtime in the circuit delay. One is at transistor-level, which is highly ac- lifespan is also crucial in many application fields such as aerospace, curate but very slow. The other is at gate-level, which is faster but defense, and medical industries [5, 13]. Transistor aging is a key less accurate. Moreover, most prior studies only consider a limited source of failure that threatens the lifetime reliability of digital subset or limited value ranges of aging factors. circuits. It leads to a degradation of the electrical characteristics of In this paper, we propose Aadam, a fast, accurate, and versatile transistors and subsequently, a considerable increase of the device aging-aware delay model for generic cell libraries. In Aadam, we delay [19]. For example, the Negative Bias Temperature Instability first use transistor-level SPICE simulations to accurately charac- (NBTI) aging phenomenon, a major parametric reliability issue, terize the delay degradation of each library cell under a versatile may increase the circuit delay by up to 30% [14]. This impact may set of aging factors, including both physical parameters (i.e., initial eventually lead to violations of circuit timing constraints, reduction threshold voltage and transistor width/length ratio) and operat- of mean time to failure, and faster wear-out of the system. ing conditions (i.e., working temperature, signal probability, input To steer clear of the aging effects and guarantee the correct signal slew range, output load capacitance range, and projected functionality of the circuit for the projected lifetime (C), designers lifetime). For each library cell, we then train a feed-forward neu- have to include a safety timing margin called timing guard band in ral network (FFNN) to learn the relation between the input aging the design [6, 27]. This timing guard band may decrease the circuit factors and output cell delay degradation. Therefore, for a given frequency, as shown in Equation 1. input circuit and a given combination of aging factors, we can use 1 the trained FFNNs to quickly and accurately infer the delay degra- 5 A4@D4=2~ = ; 퐷 ¹Cº = 퐷 ¹0º ¸ 퐷 ¹Cº (1) 퐷 ¹Cº 퐺퐵 dation for each gate in the circuit. Finally, to effectively estimate the aging-aware lifetime delay of large-scale circuits, we also inte- where 5 A4@D4=2~ denotes the targeted frequency of the design and grate Aadam into a state-of-the-art static timing analysis tool called 퐷 ¹Cº represents the lifetime delay. 퐷 ¹0º and 퐷퐺퐵 ¹Cº respectively OpenTimer. Experimental results demonstrate that Aadam achieves denote the initial delay of the circuit and the delay of the tim- fast estimation of the aging-induced delay with high accuracy close ing guard band. To reduce the performance overhead imposed by to transistor-level simulation. 퐷퐺퐵 ¹Cº, it is essential to accurately estimate the aging-induced delay degradation of the circuit and apply a minimum 퐷퐺퐵 ¹Cº. KEYWORDS However, it is nontrivial to quickly and accurately estimate the Aging, Reliability, Delay Model, Machine Learning, Cell Library aging-induced delay degradation as there are many factors that affect the delay degradation. They include both physical parame- ACM Reference Format: ters (i.e., initial threshold voltage and transistor width/length ratio) Seyed Milad Ebrahimipour1, Behnam Ghavami2, Hamid Mousavi1, and operating conditions (i.e., working temperature, signal proba- Mohsen Raji3 and Zhenman Fang2, Lesley Shannon2. 2020. Aadam: A Fast, Accurate, and Versatile Aging-Aware Cell Library Delay Model using bility, input signal slew range, output load capacitance range, and Feed-Forward Neural Network. In IEEE/ACM International Conference on projected lifetime), which will be explained in detail in Section 2.1.2. Computer-Aided Design (ICCAD ’20), November 2–5, 2020, Virtual Event, USA. As will be detailed in Section 2.2, the prior studies that model ACM, New York, NY, USA, 9 pages. https://doi.org/10.1145/3400302.3415605 the aging effects in the circuit delay can be divided into thefol- lowing categories. The first category of work [34] uses full-circuit Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed transistor-level SPICE simulation to achieve high accuracy, but runs for profit or commercial advantage and that copies bear this notice and the full citation at very slow speed. The second category of work [2, 28–30, 37, 40] on the first page. Copyrights for components of this work owned by others than ACM uses lookup table (LUT) based gate-level models to achieve faster must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a speed at the expense of lower accuracy. The third category of work fee. Request permissions from permissions@acm.org. [16, 26] only performs SPICE simulation for a set of critical paths, ICCAD ’20, November 2–5, 2020, Virtual Event, USA but is still slow for large-scale circuits. The last category of work © 2020 Association for Computing Machinery. ACM ISBN 978-1-4503-8026-3/20/11...$15.00 [20, 25, 38, 39] starts to use traditional machine learning techniques https://doi.org/10.1145/3400302.3415605 such as support vector machine (SVM) and non-linear regression ICCAD ’20, November 2–5, 2020, Virtual Event, USA Seyed Milad Ebrahimipour1, Behnam Ghavami2, Hamid Mousavi1, Mohsen Raji3 and Zhenman Fang2, Lesley Shannon2 model to predict the delay caused by the NBTI effect at gate-level or the channel and degrades the transistor’s mobility ¹`º. Second, critical path level. In fact, most prior studies only consider a limited because of the non-epitaxial structure of (8$2, the vertical electric subset of aging factors and/or limited value ranges of aging factors. field leads to defect generation in the interface of the transistor To address those issues in prior studies, in this paper, we propose and the formation of defects inside the (8$2. These defects cause Aadam, an accurate yet fast aging-aware delay model for generic accumulated charges around and within the gate dielectric, which cell libraries by considering a versatile set of aging factors that are ultimately increases the threshold voltage ¹+Cℎº of the transistor. summarized in Section 2.1.2. The major idea behind Aadam is to For a projected lifetime (C), the delay of a transistor ¹퐷¹Cºº is leverage 1) the high accuracy of transistor-level SPICE simulation inversely proportional to its drain current ¹퐼퐷 ¹Cºº, and 퐼퐷 ¹Cº is a to characterize the delay degradation of library cells only; 2) the function of the transistor’s mobility ¹`¹Cºº and threshold voltage learning power of feed-forward neural networks (FFNN) [18] to ¹+Cℎ ¹Cºº, as shown in Equation 2 (where +33 denotes the supply accurately and quickly predict the aging-induced delay at gate- voltage). As a result, with `¹Cº decreasing and +Cℎ ¹Cº increasing level under the versatile combinations of aging effects; and 3) the during the operation of the transistor, 퐼퐷 ¹Cº decreases and the fast aging-aware static timing analysis (STA) of large-scale circuits transistor delay ¹퐷¹Cºº increases. using the state-of-the-art STA tool called OpenTimer [21]. 1 `¹Cº Aadam provides a fully automated framework for aging-aware 퐷¹Cº / ; 퐼 ¹Cº ≈ ¹+ − + ¹Cºº2 (2) 퐼 ¹Cº 퐷 2 33 Cℎ STA of large-scale circuits. In Aadam, we first characterize the 퐷 delay degradation of each library cell under a versatile set of aging 2.1.2 Gate-Level. Each gate in the circuit is exposed to different factors as mentioned earlier, using the accurate transistor-level operating conditions that can lead to different aging-induced delay SPICE simulations. Based on these characterized data, we train an degradations over the projected lifetime (t). FFNN to learn the relation between the input aging factors and 1. The working temperature (T) has a great impact on the trap output delay degradation for each library cell. As a result, these generation of transistors in a gate. As ) increases, the rate of trained FFNNs can quickly and accurately predict the aging-induced interface and oxide trap generation increases [3, 4], i.e., +Cℎ ¹Cº delay at gate-level. Finally, we integrate these trained FFNN models increases faster and `¹Cº decreases faster. This includes both to a state-of-the-art STA tool called OpenTimer [21] to effectively Positive Bias Temperature Instability (PBTI) effect in NMOS estimate the aging-aware lifetime delay of large-scale circuits. We transistors and Negative Bias Temperature Instability (NBTI) plan to release our Aadam toolflow to the public in the near future. effect in PMOS transistors. In our experiments, we use the open-source Nangate 45nm 2. The signal probability (_) also has a great impact on the trap generic cell library [31], and a few circuits from the ISCAS’85 [10], generation.
Loading...
Loading...
Loading...
Loading...
Loading...

4 pages remaining, click to load more.

Recommended publications
  • UCLA Electronic Theses and Dissertations
    UCLA UCLA Electronic Theses and Dissertations Title Implications of Modern Semiconductor Technologies on Gate Sizing Permalink https://escholarship.org/uc/item/56s9b2tm Author Lee, John Publication Date 2012 Peer reviewed|Thesis/dissertation eScholarship.org Powered by the California Digital Library University of California University of California Los Angeles Implications of Modern Semiconductor Technologies on Gate Sizing A dissertation submitted in partial satisfaction of the requirements for the degree Doctor of Philosophy in Electrical Engineering by John Hyung Lee 2012 c Copyright by John Hyung Lee 2012 Abstract of the Dissertation Implications of Modern Semiconductor Technologies on Gate Sizing by John Hyung Lee Doctor of Philosophy in Electrical Engineering University of California, Los Angeles, 2012 Professor Puneet Gupta, Chair Gate sizing is one of the most flexible and powerful methods available for the timing and power optimization of digital circuits. As such, it has been a very well-studied topic over the past few decades. However, developments in modern semiconductor technologies have changed the context in which gate sizing is performed. The focus has shifted from custom design methods to standard cell based designs, which has been an enabler in the design of modern, large-scale designs. We start by providing benchmarking efforts to show where the state-of-the-art is in standard cell based gate sizing. Next, we develop a framework to assess the impact of the limited precision and range available in the standard cell library on the power-delay tradeoffs. In addition, shrinking dimensions and decreased manufacturing process control has led to variations in the performance and power of the resulting designs.
    [Show full text]
  • Predictive Aging of Reliability of Two Delay Pufs
    Predictive Aging of Reliability of two Delay PUFs Naghmeh Karimi1, Jean-Luc Danger2;3, Florent Lozac'h3, and Sylvain Guilley2;3 1 ECE Department, Rutgers University, Piscataway, NJ, USA 08854 naghmeh.karimi@rutgers.edu 2 LTCI, CNRS, T´el´ecom ParisTech, Universit´eParis-Saclay, 75013 Paris, France firstname.lastname@telecom-paristech.fr 3 Secure-IC SAS, 35510 Cesson-S´evign´e,France firstname.lastname@secure-ic.com Abstract. To protect integrated circuits against IP piracy, Physically Unclonable Functions (PUFs) are deployed. PUFs provide a specific signature for each integrated circuit. However, environmental variations, (e.g., temperature change), power supply noise and more influential IC aging affect the functionally of PUFs. Thereby, it is important to evaluate aging effects as early as possible, preferentially at design time. In this paper we investigate the effect of aging on the stability of two delay PUFs: arbiter-PUFs and loop-PUFs and analyze the architectural impact of these PUFS on reliability decrease due to aging. We observe that the reliability of the arbiter-PUF gets worse over time, whereas the reliability of the loop-PUF remains constant. We interpret this phenomenon by the asymmetric aging of the arbiter, because one half is active (hence aging fast) while the other is not (hence aging slow). Besides, we notice that the aging of the delay chain in the arbiter-PUF and in the loop-PUF has no impact on their reliability, since these PUFs operate differentially. 1 Introduction With the advancement of VLSI technology, people are increasingly relying on electronic devices and in turn integrated circuits (ICs).
    [Show full text]
  • A Fine-Grained 3D IC Technology with NP-Dynamic Logic
    Received 13 June 2016; revised 24 February 2017; accepted 8 March 2017. Date of publication 20 March 2017; date of current version 7 June 2017. Digital Object Identifier 10.1109/TETC.2017.2684781 NP-Dynamic Skybridge: A Fine-Grained 3D IC Technology with NP-Dynamic Logic JIAJUN SHI, MINGYU LI, MOSTAFIZUR RAHMAN, SANTOSH KHASANVIS, AND CSABA ANDRAS MORITZ J. Shi, M. Li, and C.A. Moritz are with the Department of Electrical and Computer Engineering, University of Massachusetts, Amherst, MA 01003 M. Rahman is with the School of Computing and Engineering, University of Missouri, Kansas City, MO 65211 S. Khasanvis is with BlueRISC Inc., Amherst, MA 01002 CORRESPONDING AUTHOR: J. SHI (jiajun@umass.edu) ABSTRACT A new 3D IC fabric named NP-Dynamic Skybridge is proposed that provides fine-grained vertical 3D integration for future technology scaling. Relying on a template of vertical nanowires, it expands our prior work to incorporate and utilize both n- and p-type transistors in a novel NP-Dynamic circuit-style compatible with true 3D integration. This enables a wide range of elementary logics leading to more compact circuits, simple clocking schemes for cascading logic stages and low buffer requirement. We detail new design concepts for larger-scale circuits, and evaluate our approach using a 4-bit nanoprocessor implemented in 16 nm technology node. A new pipelining scheme specifically designed for our 3D NP-Dynamic circuits is employed in the nanoprocessor. We compare our approach with 2D CMOS as well as state-of-the-art tran- sistor-level monolithic 3D IC (T-MI) approach. Benchmarking results for the 4-bit nanoprocessor show bene- fits of up to 56.7x density, 3.8x power and 1.7x throughput over 2D CMOS.
    [Show full text]
  • EMBEDDED PROGRAMMABLE LOGIC CORES By
    IMPLEMENTATION CONSIDERATIONS FOR “SOFT” EMBEDDED PROGRAMMABLE LOGIC CORES by James Cheng-Huan Wu B.A.Sc., University of British Columbia, 2002 A thesis submitted in partial fulfillment of the requirements for the degree of Master of Applied Science in The Faculty of Graduate Studies Department of Electrical and Computer Engineering We accept this thesis as conforming to the required standard: ___________________________ ___________________________ ___________________________ ___________________________ The University of British Columbia September 2004 © James C.H. Wu, 2004 ABSTRACT IMPLEMENTATION CONSIDERATIONS FOR “SOFT” EMBEDDED PROGRAMMABLE LOGIC CORES As integrated circuits become increasingly more complex and expensive, the ability to make post-fabrication changes will become much more attractive. This ability can be realized using programmable logic cores. Currently, such cores are available from vendors in the form of “hard” macro layouts. An alternative approach for fine-grain programmability is possible: vendors supply an RTL version of their programmable logic fabric that can be synthesized using standard cells. Although this technique may suffer in terms of speed, density, and power overhead, the task of integrating such cores is far easier than the task of integrating “hard” cores into an ASIC or SoC. When the required amount of programmable logic is small, this ease of use may be more important than the increased overhead. In this thesis, we identify potential implementation issues associated with such cores, and investigate in depth the area, speed and power overhead of using this approach. Based on this investigation, we attempt to improve the performance of programmable cores created in this manner. Using a test-chip implementation, we identify three main issues: core size selection, I/O connections, and clock-tree synthesis.
    [Show full text]
  • Eg, Nangate 15 Nm Standard Cell Library
    Nanosystem Design Kit (NDK): Transforming Emerging Technologies into Physical Designs of VLSI Systems G. Hills, M. Shulaker, C.-S. Lee, H.-S. P. Wong, S. Mitra Stanford Massachusetts University Institute of Technology Abundant-Data Explosion “Swimming in sensors, drowning in data” Wide variety & complexity Unstructured data 0 40K 0 ExaB (Billionsof GB) 2006 Year 2020 Mine, search, analyze data in near real-time Data centers, mobile phones, robots 2 Abundant-Data Applications Huge memory wall: processors, accelerators Energy Measurements Genomics classification Natural language processing 5% 18 % 0% 0% … 95 82 % % Compute Memory Intel performance counter monitors 2 CPUs, 8-cores/CPU + 128GB DRAM 3 US National Academy of Sciences (2011) 4 Computing Today 2-Dimensional 5 3-Dimensional Nanosystems Computation immersed in memory 6 3-Dimensional Nanosystems Computation immersed in memory Increased functionality Fine-grained, Memory ultra-dense 3D Computing logic Impossible with today’s technologies 7 Enabling Technologies 3D Resistive RAM Massive storage No TSV 1D CNFET, 2D FET Compute, RAM access thermal STT MRAM Ultra-dense, Quick access fine-grained 1D CNFET, 2D FET vias Compute, RAM access thermal 1D CNFET, 2D FET Silicon Compute, Power, Clock compatible thermal 8 Nanosystems: Compact Models Essential 3D Resistive RAM nanohub Massive storage 1D CNFET, 2D FET Compute, RAM access thermal STT MRAM Quick access m-Cell 1D CNFET, 2D FET Compute, RAM access thermal 1D CNFET, 2D FET Compute, Power, Clock thermal 9 Compact Models: Insufficient Alone Design for Realistic Systems Wire parasitics Inter-module interface circuits Routing congestion Application-dependent workloads Multiple clock domains Cache architecture Memory access patterns Processor vs.
    [Show full text]
  • Timing Analysis of Integrated Circuits
    UNIVERSITY OF THESSALY SCHOOL OF ENGINEERING Department of Computer & Communication Engineering TIMING ANALYSIS OF INTEGRATED CIRCUITS Master Thesis : Alexandros Mittas Lazaridis July 2012 Volos - Greece 1 Institutional Repository - Library & Information Centre - University of Thessaly 08/12/2017 17:59:09 EET - 137.108.70.7 DEDICATION To my parents and friends Hope for better days. 2 Institutional Repository - Library & Information Centre - University of Thessaly 08/12/2017 17:59:09 EET - 137.108.70.7 ACKNOWLEDGMENTS First I would like to thank Dr. George Stamoulis for advising me for the last 4 years. I have learned many things from him and consider myself fortunate to have been one of his students. I would also like to thank Dr. Nestoras Eymorfopoulos and Dr. Ioannis Moudanos. Without their patience and crucial support this thesis would not have been completed. Finally I am really grateful to my roommates in E5 room of Glavani Steet whose help was really appreciated. Konstantis, Giorgos, Babis, Tasos, Sofia, and Alexia I am really obliged. Forgive me if having forgotten to mention anyone. 3 Institutional Repository - Library & Information Centre - University of Thessaly 08/12/2017 17:59:09 EET - 137.108.70.7 1 Introduction _________________________________ 6 1.1 Goal of this Thesis.............................................................................................6 1.2 Moore’s Law……………………..……………………………..…………………………………………..6 2 Timing Analysis 8 2.1 What is Timing Analysis......................................................................................8
    [Show full text]
  • Standard Cell Library Design and Optimization Methodology for ASAP7
    Standard Cell Library Design and Optimization Methodology for ASAP7 PDK (Invited Paper) Xiaoqing Xu1, Nishi Shah1, Andrew Evans1, Saurabh Sinha1,BrianCline1,andGregYeric1 1ARM Inc., Austin, TX, USA xiaoqing.xu,nishi.shah,andrew.evans,saurabh.sinha,brian.cline,greg.yeric @arm.com { } ABSTRACT explore the impact of lithography design rules and SC archi- Standard cell libraries are the foundation for the entire back- tectures [2,7–9]. However, very few SC design and optimiza- end design and optimization flow in modern application-specific tion techniques are discussed for complex logic/sequential integrated circuit designs. At 7nm technology node and be- cells. More importantly, none of them are publicly available, yond, standard cell library design and optimization is becom- which prevents academic researchers from exploring novel SC ing increasingly difficult due to extremely complex design design and optimization techniques. constraints, as described in the ASAP7 process design kit Therefore, this study proposes a set of design and opti- (PDK). Notable complexities include discrete transistor siz- mization techniques to create publicly-available SC libraries ing due to FinFETs, complicated design rules from lithogra- with ASAP7 PDK [10–12]. We first discuss exhaustive tran- phy and restrictive layout space from modern standard cell ar- sistor sizing for cell timing optimization taking advantage of chitectures. The design methodology presented in this paper the discrete transistor sizing for a FinFET-based SC design. enables efficient and high-quality standard cell library design Generalized Euler path theory [13] is adopted to generate and optimization with the ASAP7 PDK. The key techniques high-quality transistor placement results. The generalized include exhaustive transistor sizing for cell timing optimiza- Euler paths lead to a much larger solution space than that of tion, transistor placement with generalized Euler paths and conventional Euler path theory, while accommodating pin ac- back-end design prototyping for library-level explorations.
    [Show full text]
  • Rapid Co-Optimization of Processing and Circuit Design to Overcome
    1082 IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, VOL. 34, NO. 7, JULY 2015 Rapid Co-Optimization of Processing and Circuit Design to Overcome Carbon Nanotube Variations Gage Hills, Jie Zhang, Member, IEEE,MaxMarcelShulaker,HaiWei,Student Member, IEEE, Chi-Shuen Lee, Arjun Balasingam, H.-S. Philip Wong, Fellow, IEEE, and Subhasish Mitra, Fellow, IEEE Abstract—Carbon nanotube field-effect transistors (CNFETs) are promising candidates for building energy-efficient digital systems at highly scaled technology nodes. However, carbon nan- otubes (CNTs) are inherently subject to variations that reduce circuit yield, increase susceptibility to noise, and severely degrade their anticipated energy and speed benefits. Joint exploration and optimization of CNT processing options and CNFET cir- cuit design are required to overcome this outstanding challenge. Fig. 1. (a) CNT. (b) Typical CNFET structure. (c) Scanning electron microscopy image of the CNFET channel. Unfortunately, existing approaches for such exploration and optimization are computationally expensive, and mostly rely on trial-and-error-based ad hoc techniques. In this paper, we present a framework that quickly evaluates the impact of CNT CNFET-based digital systems with massive integration of logic variations on circuit delay and noise margin, and systemati- cally explores the large space of CNT processing options to and memory [42]. derive optimized CNT processing and CNFET circuit design The schematic of a CNFET is shown in Fig. 1. Multiple guidelines.
    [Show full text]
  • Does Logic Locking Work with EDA Tools?
    Does logic locking work with EDA tools? Zhaokun Han Muhammad Yasin Jeyavijayan (JV) Rajendran Texas A&M University Texas A&M University Texas A&M University hzhk0618@tamu.edu myasin@tamu.edu jeyavijayan@tamu.edu Abstract delegates IC fabrication to TSMC or Samsung, and deputes product assembly/test services to Foxconn [2]. Logic locking is a promising solution against emerging In a globalized supply chain, the untrusted entities may hardware security threats, which entails protecting a Boolean obtain a design netlist1, a chip layout, or a manufactured IC. circuit using a “keying” mechanism. The latest and hith- This may lead to threats such as IP piracy, counterfeiting, erto unbroken logic-locking techniques are based on the reverse engineering (REing), overbuilding, and insertion of “corrupt-and-correct (CAC)” principle, offering provable se- hardware Trojans [3]. IP piracy entails malicious entities curity against input-output query attacks. However, it remains illegally using IPs. A foundry can manufacture additional unclear whether these techniques are susceptible to structural ICs to sell at lower profit margins. End-users can conduct attacks. This paper exploits the properties of integrated circuit piracy by REing an IC to extract the design netlist or other (IC) design tools, also termed electronic design automation design/technology secrets. REing of an IC involves peeling off (EDA) tools, to undermine the security of the CAC techniques. the package of an IC, etching the IC layer-by-layer, imaging Our proposed attack can break all the CAC techniques, in- each layer, and stitching the images together to extract the cluding the unbroken CACrem technique that 40+ hackers design netlist.
    [Show full text]
  • FPGA Emulation for Critical-Path Coverage Analysis
    FPGA Emulation for Critical-Path Coverage Analysis by Kyle Balston B.A.Sc., Simon Fraser University, 2010 ATHESISSUBMITTEDINPARTIAL FULFILLMENT OFTHEREQUIREMENTSFORTHEDEGREEOF Master of Applied Science in THEFACULTYOFGRADUATESTUDIES (Electrical and Computer Engineering) The University Of British Columbia (Vancouver) October 2012 © Kyle Balston, 2012 Abstract A major task in post-silicon validation is timing validation: it can be incredibly difficult to ensure a new chip meets timing goals. Post-silicon validation is the first opportunity to check timing with real silicon under actual operating conditions and workloads. However, post-silicon tests suffer from low observability, making it difficult to properly quantify test quality for the long-running random and di- rected system-level tests that are typical in post-silicon. In this thesis, we propose a technique for measuring the quality of long-running system-level tests used for timing coverage through the use of on-chip path monitors to be used with FPGA emulation. We demonstrate our technique on a non-trivial SoC, measuring the cov- erage of 2048 paths (selected as most critical by static timing analysis) achieved by some pre-silicon system-level tests, a number of well-known benchmarks, boot- ing Linux, and executing randomly generated programs. The results show that the technique is feasible, with area and timing overheads acceptable for pre-silicon FPGA emulation. ii Preface Preliminary work related to this thesis has been published in a conference paper and a journal paper. The work itself will be published as an invited conference paper. The first paper, Post-Silicon Code Coverage Evaluation with Reduced Area Overhead for Functional Verification of SoC, was published as [33].
    [Show full text]
  • Exploring Voltage Scaling Techniques in Embedded Processors Hardware Monitors
    Exploring Voltage Scaling Techniques in Embedded Processors Hardware Monitors Arman Pouraghily, Padmaja Duggisetty, Thiago Teixeira VLSI Design Principles Final Report - ECE 658 Fall 2014 Department of Electrical and Computer Engineering University of Massachusetts, Amherst, MA, USA Email: fapouraghily,pduggisetty,tteixeirag@umass.edu Abstract—The Internet is a very important communication is taken into account. Our goal to estimate power consumption infrastructure in modern life, with applications varying from overhead – imposed by the introduction of hardware monitor banking transactions, transfer of copyrighted material, compa- – and by applying power saving techniques, minimizing this nies’ assets, to the most simple web surfing. The role of Internet is expected to grow exponentially with cloud computing and overhead is achieved by using voltage scaling techniques. Internet of Things. In this context, the router will continue to The basic idea behind our approach is the simple fact that be the equipment carrying most of the Internet traffic, with the logical complexity of the modules used in the hardware an increasing number of routers’ packet forwarding application monitor is by far less that the ones used in the main CPU being deployed in the form of programmable network processor. Many security algorithms have been developed for the network which means that our hardware monitor could run with much security as a whole in order to increase reliability of com- higher clock rate. However, running the hardware monitor at a munications. The network infrastructure security has received higher frequency than the Network Processor leads to resource less attention, even though attacks in the data plane that can wastage.
    [Show full text]
  • Common Path Pessimism Removal in Static Timing Analysis
    Scholars' Mine Masters Theses Student Theses and Dissertations Summer 2015 Common path pessimism removal in static timing analysis Chunyu Wang Follow this and additional works at: https://scholarsmine.mst.edu/masters_theses Part of the Computer Engineering Commons Department: Recommended Citation Wang, Chunyu, "Common path pessimism removal in static timing analysis" (2015). Masters Theses. 7440. https://scholarsmine.mst.edu/masters_theses/7440 This thesis is brought to you by Scholars' Mine, a service of the Missouri S&T Library and Learning Resources. This work is protected by U. S. Copyright Law. Unauthorized use including reproduction for redistribution requires the permission of the copyright holder. For more information, please contact scholarsmine@mst.edu. COMMON PATH PESSIMISM REMOVAL IN STATIC TIMING ANALYSIS by CHUNYU WANG A THESIS Presented to the Faculty of the Graduate School of the MISSOURI UNIVERSITY OF SCIENCE AND TECHNOLOGY In Partial Fulfillment of the Requirements for the Degree MASTER OF SCIENCE IN COMPUTER ENGINEERING 2015 Approved by Yiyu Shi, Advisor Minsu Choi Jun Fan 2015 Chunyu Wang All Rights Reserved iii ABSTRACT Static timing analysis is a key process to guarantee timing closure for modern IC designs. However, additional pessimism can significantly increase the difficulty to achieve timing closure. Common path pessimism removal (CPPR) is a prevalent step to achieve accurate timing signoff. To speed up the existing exhaustive exploration on all paths in a design, this thesis introduces a fast multi-threading timing analysis for removing common path pessimism based on block-based static timing analysis. Experimental results show that the proposed method has faster runtime in removing excess pessimism from clock paths.
    [Show full text]