A Stochastic-Computing based Deep Learning Framework using Adiabatic Quantum-Flux-Parametron Superconducting Technology

Ruizhe Cai, Ao Ren
Northeastern University, USA
{cai.ruiz,ren.ao}@husky.neu.edu

Olivia Chen
Yokohama National University, Japan
[email protected]

Ning Liu, Caiwen Ding
Northeastern University, USA
{liu.ning,ding.ca}@husky.neu.edu

Xuehai Qian
University of Southern California, USA
[email protected]

Jie Han
University of Alberta, Canada
[email protected]

Wenhui Luo
Yokohama National University, Japan
[email protected]

Nobuyuki Yoshikawa
Yokohama National University, Japan
[email protected]

Yanzhi Wang
Northeastern University, USA
[email protected]

arXiv:1907.09077v1 [cs.NE] 22 Jul 2019

ABSTRACT

The Adiabatic Quantum-Flux-Parametron (AQFP) superconducting technology has been recently developed. It achieves the highest energy efficiency among superconducting logic families, potentially a 10^4-10^5 gain compared with state-of-the-art CMOS. In 2016, the successful fabrication and testing of AQFP-based circuits with the scale of 83,000 JJs demonstrated the scalability and potential of implementing large-scale systems using AQFP. As a result, AQFP is promising for high-performance computing and deep space applications, with Deep Neural Network (DNN) inference acceleration as an important example.

Besides ultra-high energy efficiency, AQFP exhibits two unique characteristics. The first is its deep pipelining nature: each AQFP logic gate is connected with an AC clock signal, which increases the difficulty of avoiding RAW hazards. The second is the unique opportunity of true random number generation (RNG) using a single AQFP buffer, far more efficient than RNG in CMOS. We point out that these two characteristics make AQFP especially compatible with the stochastic computing (SC) technique, which uses a time-independent bit sequence for value representation and is compatible with the deep pipelining nature. Further, the application of SC to DNNs has been investigated in prior work, and its suitability has been illustrated, as SC is more compatible with approximate computations.

This work is the first to develop an SC-based DNN acceleration framework using AQFP technology. The deep-pipelining nature of AQFP circuits makes it difficult to design accumulators/counters in AQFP, which makes the prior design in SC-based DNN unsuitable. We overcome this limitation by taking into account different properties of CONV and FC layers: (i) the inner product calculation in FC layers has more inputs than that in CONV layers; (ii) an accurate activation function is critical in CONV rather than FC layers. Based on these observations, we propose (i) accurate integration of summation and activation function in CONV layers using a bitonic sorting network and feedback loop, and (ii) a low-complexity categorization block for FC layers based on a chain of majority gates. For a complete design we also develop (i) an ultra-efficient stochastic number generator in AQFP, (ii) a high-accuracy sub-sampling (pooling) block in AQFP, and (iii) majority synthesis for further performance improvement and automatic buffer/splitter insertion as required by AQFP circuits. Experimental results suggest that the proposed SC-based DNN using AQFP can achieve up to 6.8 × 10^4 times higher energy efficiency compared to a CMOS-based implementation while maintaining 96% accuracy on the MNIST dataset.

KEYWORDS

Stochastic Computing, Deep Learning, Adiabatic Quantum-Flux-Parametron, Superconducting

ACM Reference Format:
Ruizhe Cai, Ao Ren, Olivia Chen, Ning Liu, Caiwen Ding, Xuehai Qian, Jie Han, Wenhui Luo, Nobuyuki Yoshikawa, and Yanzhi Wang. 2019. A Stochastic-Computing based Deep Learning Framework using Adiabatic Quantum-Flux-Parametron Superconducting Technology. In The 46th Annual International Symposium on Computer Architecture (ISCA '19), June 22–26, 2019, Phoenix, AZ, USA. ACM, New York, NY, USA, 12 pages. https://doi.org/10.1145/3307650.3322270

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
ISCA '19, June 22–26, 2019, Phoenix, AZ, USA
© 2019 Association for Computing Machinery.
ACM ISBN 978-1-4503-6669-4/19/06...$15.00
https://doi.org/10.1145/3307650.3322270

1 INTRODUCTION

Wide-ranging applications of deep neural networks (DNNs) in image classification, computer vision, autonomous driving, embedded and IoT systems, etc., call for high-performance and energy-efficient implementation of the inference phase of DNNs. To simultaneously achieve high performance and energy efficiency, hardware acceleration of DNNs, including FPGA- and ASIC-based implementations, has been extensively investigated [4, 6–9, 12, 13, 17–20, 28, 29, 33, 34, 37, 39–41, 48, 49, 51, 53–55]. However, most of these designs are CMOS based, and suffer from a performance limitation because Moore's Law is reaching its end.

Widely known for low energy dissipation and ultra-fast switching speed, Josephson Junction (JJ) based superconductor logic families have been proposed and implemented to process analog and digital signals [24]. They have been perceived as an important candidate to replace state-of-the-art CMOS due to their superior potential in operation speed and energy efficiency, as recognized by the U.S. IARPA C3 and SuperTools Programs and the Japan MEXT-JSPS Project. Adiabatic quantum-flux-parametron (AQFP) logic is an energy-efficient superconductor logic family based on the quantum-flux-parametron (QFP) [26]. AQFP logic achieves high energy efficiency by adopting adiabatic switching [23], in which the potential energy profile evolves from a single well to a double well so that the logic state can change quasi-statically. The energy-delay-product (EDP) of AQFP circuits fabricated using processes such as the AIST standard process 2 (STP2) [30] and the MIT-LL SFQ process [47] is only three orders of magnitude larger than the quantum limit [45]. AQFP can potentially achieve a 10^4-10^5 energy efficiency gain compared with state-of-the-art CMOS (still a two orders of magnitude energy efficiency gain when cooling energy is accounted for), with a clock frequency of several GHz. Recently, the successful fabrication and testing of AQFP-based implementations with the scale of 83,000 JJs demonstrated the scalability and potential of implementing large-scale circuits/systems using AQFP [31]. As a result, the AQFP technology is promising for high-performance computing, with DNN inference acceleration as an important example.

Figure 1: Junction level schematic of an AQFP buffer.

Figure 2: Example of AQFP logic gates. Panels: (a) MAJ, (b) AND, (c) NOR, (d) SPLITTER.

[Figure: AQFP logic block with a data input and four-phase clock inputs (Clock_in_phase 1 to 4); caption not recovered.]

The AQFP technology uses AC bias/excitation currents as both multi-phase clock signal and power supply [43] to mitigate the power consumption overhead of DC bias in other superconducting logic technologies. Besides the ultra-high energy efficiency, AQFP exhibits two unique characteristics. The first is its deep pipelining nature: each AQFP logic gate is connected with an AC clock signal and occupies one clock phase, which increases the difficulty of avoiding RAW (Read after Write) hazards in conventional binary computing. The second is the unique opportunity of true random number generation (RNG) using a single AQFP buffer (double JJ) [16, 44], which is far more efficient than RNG in CMOS.

We point out that these two characteristics make AQFP technology especially compatible with the stochastic computing (SC) technique [14][1], which allows the implementation of 64 basic operations using an extremely small hardware footprint. SC uses a time-independent bit sequence for value representation, and lacks RAW dependency among the bits in the stream. As a result, it is compatible with the deep-pipelining nature of superconducting logic. Furthermore, one important limiting factor of SC, i.e., the overhead of RNG [27][32], can be largely mitigated in AQFP.

SC is inherently an approximate computing technique [15][52], and there have been disputes about the suitability of SC for precise computing applications [3]. On the other hand, the DNN inference engine is essentially an approximate computing application. This is because the final classification result depends on the relative score/logit values of different classes, instead of absolute values. Recent work [35][3] has pointed out the suitability of SC for DNN acceleration, and [50] has further proved the equivalence between SC-based DNNs and binary neural networks, where the latter originate from the deep learning community [11]. All the above discussions suggest the potential to build SC-based DNN acceleration using AQFP technology.

This paper is the first to develop an SC-based DNN acceleration framework using AQFP superconducting technology. We adopt the bipolar format in SC because weights and inputs can be positive or negative, and build a stochastic number generation block in AQFP with ultra-high efficiency. The deep-pipelining nature of AQFP circuits makes it difficult to design accumulators/counters in AQFP, which makes the prior design in SC-based DNN [35] unsuitable.