Digital Electronics and Analog Photonics for Convolutional Neural Networks (DEAP-Cnns)

Total Page:16

File Type:pdf, Size:1020Kb

Digital Electronics and Analog Photonics for Convolutional Neural Networks (DEAP-Cnns) Digital Electronics and Analog Photonics for Convolutional Neural Networks (DEAP-CNNs) Viraj Bangari,∗ Bicky A. Marquez, Heidi B. Miller, and Bhavin J. Shastriy Department of Physics, Engineering Physics & Astronomy, Queen's University, Kingston, ON KL7 3N6, Canada Alexander N. Tait National Institute of Standards and Technology (NIST), Boulder, Colorado 80305, USA Mitchell A. Nahmias, Thomas Ferreira de Lima, Hsuan-Tung Peng, and Paul R. Prucnal Department of Electrical Engineering, Princeton University, Princeton, NJ 08544, USA (Dated: July 3, 2019) Convolutional Neural Networks (CNNs) are powerful and highly ubiquitous tools for extracting features from large datasets for applications such as computer vision and natural language pro- cessing. However, a convolution is a computationally expensive operation in digital electronics. In contrast, neuromorphic photonic systems, which have experienced a recent surge of interest over the last few years, propose higher bandwidth and energy efficiencies for neural network training and inference. Neuromorphic photonics exploits the advantages of optical electronics, including the ease of analog processing, and busing multiple signals on a single waveguide at the speed of light. Here, we propose a Digital Electronic and Analog Photonic (DEAP) CNN hardware architecture that has potential to be 2.8 to 14 times faster while maintaining the same power usage of current state-of-the-art GPUs. I. INTRODUCTION predefined random weights of their hidden layers cannot be modified [8]. The success of CNNs for large-scale image recognition An alternative approach uses silicon photonics to de- has stimulated research in developing faster and more sign fully programmable neural networks [15], using a so- accurate algorithms for their use. However, CNNs are called broadcast-and-weight protocol [10{12]. This pro- computationally intensive and therefore results in long tocol is capable of implementing reconfigurable, recurrent processing latency. One of the primary bottlenecks is and feedforward neural network models, using a bank of computing the matrix multiplication required for forward tunable silicon microring resonators (MRRs) that recre- propagation. In fact, over 80% of the total processing ate on-chip synaptic weights. Therefore, such a protocol time is spent on the convolution [1]. Therefore, tech- allows it to emulate physical neurons. Mach-Zehnder in- niques that improve the efficiency of even forward-only terferometers have been also used to model synaptic-like propagation are in high demand and researched exten- connections of physical neurons [14]. The advantage of sively [2, 3]. the former approach over the latter is that it has already In this work, we present a complete digital electronic demonstrated fan-in, inhibition, time-resolved process- and analog photonic (DEAP) architecture capable of per- ing, and autaptic cascadability [12]. The DEAP CNN de- forming highly efficient CNNs for image recognition. The sign is therefore compatible with mainstream silicon pho- competitive MNIST handwriting dataset[4] is used as a tonic device platforms. This approach leverages the ad- benchmark test for our DEAP CNN. At first, we train a vances in silicon photonics that have recently progressed standard two-layer CNN offline, after which network pa- to the level of sophistication required for large-scale inte- rameters are uploaded to the DEAP CNN. Our scope is gration. Furthermore, this proposed architecture allows arXiv:1907.01525v1 [eess.SP] 23 Apr 2019 limited to the forward propagation, but includes power the implementation of multi-layer networks to implement and speed analyses of our proposed architecture. the deep learning framework. Due to their speed and energy efficiency, photonic neu- Inspired by the work of Mehrabian et al. [16], which ral networks have been widely investigated from different lays out a potential architecture for photonic CNNs with approaches that can be grouped into three categories: DRAM, buffers, and microring resonators, our design (1) reservoir computing [5{8]; reconfigurable architec- goes a step further by considering specific input repre- tures based on (2) ring-resonators [9{12], and (3) Mach- sentation, as well as an example of how an algorithm for Zehnder interferometers [13, 14]. Reservoir computing in tasks such as MNIST handwritten digit recognition can the discrete photonic domain successfully implement neu- be mapped to photonics. Moreover, we consider summa- ral networks for fast information processing, however the tion of multi-channel inputs, multi-dimensional kernels, the limitations on weights being between 0 and 1, and the architecture for the depth of kernel or inputs. This work is divided in five sections: Following this ∗ [email protected] introduction, in section (II), we describe convolutions as y [email protected] used in the field of signal processing. Then, we intro- 2 duce silicon photonic devices to perform convolutions in image kernel output photonics. Section (III) introduces a hardware inspired o A1,1A1,2A1,3 F1,1 F1,2 o 1 2 R o algorithm to perform such full photonic convolutions. In H A1,4 A1,5 A1,6 F1,3 F1,4 o 3 4 D R Section (IV), we utilize our previously described architec- A1,7 A1,8 A1,9 D { ture to build a two-layers DEAP CNN for MNIST hand- W F3,4 F3,1F3,2 written digit recognition. Finally, in section (V), we show F { { 3,3 F3,3 F3,4 an energy-speed benchmark test, where we compare the F3,2 F2,1 F2,2 performance of DEAP with the empirical dataset Deep- A3,1A3,2A3,3 A2,1A2,2A2,3 A1,1A1,2A1,3 F3,1 3,4 3,5 2,4 2,5 1,4 1,5 F2,3 F2,4 Bench [17]. Note, we have made the high level simulator A A A3,6 A A A2,6 A A A1,6 F2,4 A3,7 A3,8 A3,9 A2,7 A2,8 A2,9 A1,7 A1,8 A1,9 and mapping tool for the DEAP architecture publicly F1,1 F1,2 F2,3 2 available [18]. F1,3 F1,4 F2,2 DR A { F2,1 AA3,9 AA3,8 AA3,6AA3,5 AA2,9 AA2,8 AA2,6AA2,5 AA1,9 AA1,8 AA1,6AA1,5 F1,4 o4 II. CONVOLUTIONS AND PHOTONICS A3,8A3,7A3,5A3,4A2,8A2,7A2,5A2,4A1,8A1,7A1,5A1,4 F1,3 o3 = o A3,6 A3,5A3,3 A3,2A2,6 A2,5A2,3 A2,2A1,6 A1,5A1,3 A1,2 * F1,2 2 II.1. Convolutions Background A3,5A3,4A3,2A3,1A2,5A2,4A2,2A2,1A1,5A1,4A1,2A1,1 F1,1 o 1 DR 2 F o A convolution of two discrete domain functions f and g is defined by: Figure 1. Schematic illustration of a convolution. At the top of the figure, an input image is represented as a matrix of 1 X numbers with dimensionality H × W × D where H, W and (f ∗ g)[t] = f[τ]g[t − τ]; (1) D are the height, width and depth of the image, respectively. t=−∞ Each element Ai;j of A represents the intensity of a pixel at where (f ∗g) represents a weighted average of the function that particular spatial location. The kernel F is a matrix with f[τ] when it is weighting by g[−τ] shifted by t. The dimensionality R × R × D, where each element Fi;j is defined as a real number. The kernel is slid over the image by using a weighting function g[−τ] emphasizes different parts of stride S equal to one. As the image has multiple channels (or the input function f[τ] as t changes. depth) D, the same kernel is applied to each channel. Assum- In digital image processing, a similar process is fol- ing H = W , the overall output dimensionality is (H −R+1)2. lowed. The convolution of an image A with a kernel F The bottom of the figure shows how a convolution operation produces a convolved image O. An image is represented generalized into a single matrix-matrix multiplication. where as a matrix of numbers with dimensionality H×W , where the kernel F is transformed into a vector F with DR2 ele- H and W are the height and width of the image, respec- ments, and the image A is transformed into a matrix A of 2 2 tively. Each element of a matrix represents the intensity dimensionality DR × (H − R + 1) . Therefore, the output is of a pixel at that particular spatial location. A kernel represented by a vector with (H − R + 1) elements. is a matrix of real numbers with dimensionality R × R. The value of a particular convolved pixel is defined by: Table I. Summary of Convolutional Parameters R R X X Oi;j = Fk;lAi+k;j+l: (2) Parameter Meaning k=1 l=1 N Number of input images H Height of input image including padding Using matrix slicing notation, Eq. (2) can be represented W Width of input image including padding as a dot product of two vectorized matrices: D Number of input channels O = vec(F )T · vec((A )m2[i;i+R])T : (3) R Edge length of kernel i;j m;n n2[j;j+R] K Number of kernels A convolution reduces the dimensionality of the input S Stride image to (H − R + 1) × (W − R + 1), so a padding of zero values is normally applied around the edges of the input image to counteract this. A schematic illustration parameter S is referred to as the \stride" of the convolu- of a convolution in digital image processing is shown at tion. This convolution is similar to Eq. (3), except that the top of Fig. 1. the outputs from each channel are summed together in When convolutions are used to perform parallel ma- the end, and that the stride parameter is always equal to trix multiplications in neural networks such as CNNs, a 1 in image processing.
Recommended publications
  • Sirius Federal ITES-3H CLIN 0301 Catalog
    W52P1J‐16‐D‐0018 0301 Catalog ‐ Sirius Federal, LLC Catalog Catalog Vendor Part Manufacturer Manufacturer Part Number Description List Price ITES‐3H Sell Price % Off List Price UNSPSC Comment Standards Number CLIN Number Win 10 Pro 64 English French Spanish 3 0301 31209 Dell 619‐AHKN (Part of Dell configuration) (Part of $0.00 $0.00 20% 43.21.15.15 Bundle) HP 5y NBD Onsite/Disk Retention NB 3 0302 402258 Hewlett Packard (HPI) UE337E SVC , Notebook and Tablet PC (3/3/0 $345.00 $276.00 20% 81.11.18.11.00 warranty), Hardware Support during 3 0301 402259 Hewlett Packard (HPI) H1L08AA#ABA HP 3005pr USB3.0Port Replicator $260.00 $208.00 20% 43.21.16.03.00 OptiPlex 3020 Small Form Factor 3 0301 402260 Dell 210‐ABIX $700.00 $560.00 20% 43.21.15.07.00 (CTO) Wyse 7030 PCoIP zero client, 4xDP, 3 0301 402438 Dell 210‐AKBO Fiber Ready, with mouse and power $829.00 $663.20 20% 43.21.15.06.00 cord, keyboard sent separately 3 0301 402693 Hewlett Packard (HPI) D9Y32AA#ABA HP UltraSlim Dock 2013 $300.00 $240.00 20% 43.21.16.02.00 3 0301 403917 DBL ‐ Belkin Components B2B061 Belkin Wired USB Standard Keyboard $15.00 $12.00 20% 43.21.17.06.00 3 0301 403918 Dell 210‐ACYX Dell Precision 7910 XCTO Base $4,203.00 $3,362.40 20% 43.21.15.02.00 Dual Intel Xeon Processor E5‐2609 v3 3 0301 403919 Dell 338‐BFJJ $0.00 $0.00 20% 43.20.15.03.00 (6C,1.9GHz, 15M, 85W) Dual Intel Xeon Processor E5‐2609 v3 3 0301 403920 Dell 412‐AAED $0.00 $0.00 20% 43.20.15.03.00 (6C,1.9GHz, 15M, 85W) 16GB (4x4GB) 2133MHz DDR4 3 0301 403922 Dell 370‐ABUO $0.00 $0.00 20% 32.10.16.21.00 RDIMM
    [Show full text]
  • HP Grafične Postaje: HP Z1, HP Z2, HP Z4, ……
    ARES RAČUNALNIŠTVO d.o.o. HP Z1 Tržaška cesta 330 HP Z2 1000 Ljubljana, Slovenia, EU HP Z4 tel: 01-256 21 50 Grafične kartice GSM: 041-662 508 e-mail: [email protected] www.ares-rac.si 29.09.2021 DDV = 22% 1,22 Grafične postaje HP Grafične postaje: HP Z1, HP Z2, HP Z4, …….. ZALOGA Nudimo vam tudi druge modele, ki še niso v ceniku preveri zalogo Na koncu cenika so tudi opcije: Grafične kartice. Ostale opcije po ponudbi. Objavljamo neto ceno in ceno z DDV (PPC in akcijsko). Dokončna cena in dobavni rok - po konkretni ponudbi Objavljamo neto ceno in ceno z DDV (PPC in akcijsko). Dokončna cena po ponudbi. Koda HP Z1 TWR Grafična postaja /Delovna postaja Zaloga Neto cena Cena z DDV (EUR) (EUR) HP Z1 G8 TWR Zaloga Neto cena Cena z DDV 29Y17AV HP Z1 G8 TWR, Core i5-11500, 16GB, SSD 512GB, nVidia GeForce RTX 3070 8GB, USB-C, Win10Pro #71595950 Koda: 29Y17AV#71595950 HP delovna postaja, HP grafična postaja, Procesor Intel Core i5 -11500 (2,7 - 4,6 GHz) 12MB 6 jedr/12 niti Nabor vezja Intel Q570 Pomnilnik 16 GB DDR4 3200 MHz (1x16GB), 3x prosta reža, do 128 GB 1.735,18 2.116,92 SSD pogon 512 GB M.2 2280 PCIe NVMe TLC Optična enota: brez, HDD pogon : brez Razširitvena mesta 2x 3,5'', 1x 2,5'', RAID podpira RAID AKCIJA 1.657,00 2.021,54 Grafična kartica: nVidia GeForce RTX 3070 8GB GDDR6, 256bit, 5888 Cuda jeder CENA Žične povezave: Intel I219-LM Gigabit Network Brezžične povezave: brez Razširitve: 1x M.2 2230; 1x PCIe 3.0 x16; 2x PCIe 3,0 x16 (ožičena kot x4); 2x M.2 2230/2280; 2x PCIe 3.0 x1 Čitalec kartic: brez Priključki spredaj: 1x USB-C, 2x
    [Show full text]
  • Workstation Fisse E Mobili Monitor Hub Usb-C Docking
    OCCUPATEVI DEL VOSTRO BUSINESS, NOI CI PRENDEREMO CURA DEI VOSTRI COMPUTERS E PROGETTI INFORMATICI 16-09-21 16 WORKSTATION FISSE E MOBILI MONITOR HUB USB-C DOCKING STATION Noleggiare computer, server, dispositivi informatici e di rete, Cremonaufficio è un’azienda informatica presente nel terri- può essere una buona alternativa all’acquisto, molti infatti torio cremonese dal 1986. sono i vantaggi derivanti da questa pratica e il funziona- Sin dai primi anni si è distinta per professionalità e capacità, mento è molto semplice. Vediamo insieme alcuni di questi registrando di anno in anno un trend costante di crescita. vantaggi: Oltre venticinque anni di attività di vendita ed assistenza di Vantaggi Fiscali. Grazie al noleggio è possibile ottenere di- prodotti informatici, fotocopiatrici digitali, impianti telefonici, versi benefici in termini di fiscalità, elemento sempre molto reti dati fonia, videosorveglianza, hanno consolidato ed affer- interessante per le aziende. Noleggiare computer, server e mato l’azienda Cremonaufficio come fornitrice di prodotti e dispositivi di rete permette di ottenere una riduzione sulla servizi ad alto profilo qualitativo. tassazione annuale e il costo del noleggio è interamente Il servizio di assistenza tecnica viene svolto da tecnici interni, deducibile. specializzati nei vari settori, automuniti. Locazione operativa, non locazione finanziaria. A differen- Cremonaufficio è un operatore MultiBrand specializzato in za del leasing, il noleggio non prevede l’iscrizione ad una Information Technology ed Office Automation e, come tale, si centrale rischi, di conseguenza migliora il rating creditizio, avvale di un sistema di gestione operativa certificato ISO 9001. facilita il rapporto con le banche e l’accesso ai finanzia- menti.
    [Show full text]
  • HP Zbook Firefly 15 G8 Intel Core I7-1165G7 15.6Inch FHD AG LED 32GB 1TB SSD Quadro T500 4GB Wifi6 BT5 OPT.WWAN W10P64 W3/3/3
    SPESIFIKAATIOT 1(4) 28.6.2021 HP Zbook Firefly 15 G8 Intel Core i7-1165G7 15.6inch FHD AG LED 32GB 1TB SSD Quadro T500 4GB WiFi6 BT5 OPT.WWAN W10P64 W3/3/3 HP Zbook Firefly 15 G8 Intel Core i7-1165G7 15.6inch FHD AG LED 32GB Tuotenimike: 1TB SSD Quadro T500 4GB WiFi6 BT5 OPT.WWAN W10P64 W3/3/3 HP ZBook Firefly 15 G8 Mobile Workstation - Core i7 1165G7 / 2.8 GHz - Win 10 Pro 64-bit - 32 Gt RAM - 1 Tt SSD NVMe, TLC - 15.6" IPS 1920 x Kuvaus: 1080 (Full HD) - T500 / Iris Xe Graphics - Wi-Fi 6, Bluetooth - kbd: ruotsalainen/suomalainen EAN-koodi: 0195697572883 Valmistajan takuu: 36 kuukauden takuu paikan päällä Yleistä Tuotteen tyyppi: Kannettava tietokone Käyttöjärjestelmä: Windows 10 Pro 64-bit Edition - englanti / suomi / ruotsi Suoritin / Chipset CPU: Intel Core i7 (11.sukupolvi) 1165G7 / 2.8 GHz Max. turbonopeus: 4.7 GHz Ydinten määrä: Quad-Core Välimuisti: L3 - 12 Mt 64-bitin Computing: Kyllä Ominaisuudet: Intel Turbo Boost Technology Muisti RAM: 32 Gt (1 x 32 Gt) Tuettu enimmäismuisti 64 Gt (RAM): Tekniikka: DDR4 SDRAM - non-ECC Tukikomppania ThinkIT Oy Tel: 010 3965 100 Elimäenkatu 22 A Loutinkatu 49 A [email protected] 00510 HELSINKI 04400 JÄRVENPÄÄ www.tukikomppania.fi SPESIFIKAATIOT 2(4) 28.6.2021 Nopeus: 3200 MHz Nimellisnopeus: 3200 MHz Koko tai muoto: SO-DIMM 260-pin Paikkojen määrä: 2 Tyhjät paikat: 1 Tallennuslaitteet 1 Tt SSD M.2 2280 PCIe 3.0 x4 - NVM Express (NVMe), triple-level cell Päätallennus: (TLC) Näyttö Tyyppi: 15.6" - IPS LCD taustavalo-teknologia: LED-taustavalo Erotuskyky: 1920 x 1080 (Full HD) Leveä näyttö:
    [Show full text]
  • High-Performance and Energy-Efficient Irregular Graph Processing on GPU Architectures
    High-Performance and Energy-Efficient Irregular Graph Processing on GPU architectures Albert Segura Salvador Doctor of Philosophy Department of Computer Architecture Universitat Politècnica de Catalunya Advisors: Jose-Maria Arnau Antonio González July, 2020 2 Abstract Graph processing is an established and prominent domain that is the foundation of new emerging applications in areas such as Data Analytics, Big Data and Machine Learning. Appli- cations such as road navigational systems, recommendation systems, social networks, Automatic Speech Recognition (ASR) and many others are illustrative cases of graph-based datasets and workloads. Demand for higher processing of large graph-based workloads is expected to rise due to nowadays trends towards increased data generation and gathering, higher inter-connectivity and inter-linkage, and in general a further knowledge-based society. An increased demand that poses challenges to current and future graph processing architectures. To effectively perform graph processing, the large amount of data employed in these domains requires high throughput architectures such as GPGPU. Although the processing of large graph-based workloads exhibits a high degree of parallelism, the memory access patterns tend to be highly irregular, leading to poor GPGPU efficiency due to memory divergence. Graph datasets are sparse, highly unpredictable and unstructured which causes the irregular access patterns and low computation per data ratio, further lowering GPU utilization. The purpose of this thesis is to characterize the bottlenecks and limitations of irregular graph processing on GPGPU architectures in order to propose architectural improvements and extensions that deliver improved performance, energy efficiency and overall increased GPGPU efficiency and utilization. In order to ameliorate these issues, GPGPU graph applications perform stream compaction operations which process the subset of active nodes/edges so subsequent steps work on compacted dataset.
    [Show full text]
  • Die Meilensteine Der Computer-, Elek
    Das Poster der digitalen Evolution – Die Meilensteine der Computer-, Elektronik- und Telekommunikations-Geschichte bis 1977 1977 1978 1979 1980 1981 1982 1983 1984 1985 1986 1987 1988 1989 1990 1991 1992 1993 1994 1995 1996 1997 1998 1999 2000 2001 2002 2003 2004 2005 2006 2007 2008 2009 2010 2011 2012 2013 2014 2015 2016 2017 2018 2019 2020 und ... Von den Anfängen bis zu den Geburtswehen des PCs PC-Geburt Evolution einer neuen Industrie Business-Start PC-Etablierungsphase Benutzerfreundlichkeit wird gross geschrieben Durchbruch in der Geschäftswelt Das Zeitalter der Fensterdarstellung Online-Zeitalter Internet-Hype Wireless-Zeitalter Web 2.0/Start Cloud Computing Start des Tablet-Zeitalters AI (CC, Deep- und Machine-Learning), Internet der Dinge (IoT) und Augmented Reality (AR) Zukunftsvisionen Phasen aber A. Bowyer Cloud Wichtig Zählhilfsmittel der Frühzeit Logarithmische Rechenhilfsmittel Einzelanfertigungen von Rechenmaschinen Start der EDV Die 2. Computergeneration setzte ab 1955 auf die revolutionäre Transistor-Technik Der PC kommt Jobs mel- All-in-One- NAS-Konzept OLPC-Projekt: Dass Computer und Bausteine immer kleiner, det sich Konzepte Start der entwickelt Computing für die AI- schneller, billiger und energieoptimierter werden, Hardware Hände und Finger sind die ersten Wichtige "PC-Vorläufer" finden wir mit dem werden Massenpro- den ersten Akzeptanz: ist bekannt. Bei diesen Visionen geht es um die Symbole für die Mengendarstel- schon sehr früh bei Lernsystemen. iMac und inter- duktion des Open Source Unterstüt- möglichen zukünftigen Anwendungen, die mit 3D-Drucker zung und lung. Ägyptische Illustration des Beispiele sind: Berkley Enterprice mit neuem essant: XO-1-Laptops: neuen Technologien und Konzepte ermöglicht Veriton RepRap nicht Ersatz werden.
    [Show full text]
  • Rambus Buys Into CXL Interconnect Ecosystem with Two New Deals June 17 2021
    Market Insight Report Reprint Rambus buys into CXL interconnect ecosystem with two new deals June 17 2021 John Abbott The addition of PLDA and AnalogX will speed up Rambus’ initiative to enable memory expansion and pooling in disaggregated infrastructure through the emerging Compute Express Link interconnect ecosystem. CXL 2.0 is a PCIe-based technology intended to make it easier to connect CPUs with memory and specialist accelerators, and to separate memory from physical servers to improve memory bandwidth, capacity and efficiency. This report, licensed to Rambus, developed and as provided by S&P Global Market Intelligence (S&P), was published as part of S&P’s syndicated market insight subscription service. It shall be owned in its entirety by S&P. This report is solely intended for use by the recipient and may not be reproduced or re-posted, in whole or in part, by the recipient without express permission from S&P. Market Insight Report Reprint Introduction Snapshot Rambus has acquired two companies, PDLA and Acquirer Rambus AnalogX. The silicon memory IP company is looking to boost its portfolio of new products and IP through Targets PLDA, AnalogX contributing to the emerging CXL (Compute Express Subsector Semiconductor IP Link) interconnect ecosystem. CXL 2.0, launched last November, is a PCIe-based technology intended to Deal value Not disclosed make it easier to connect CPUs with memory and Date announced 16-Jun-21 specialist accelerators, and to separate memory Closing date, Q3 2021 from physical servers to improve memory bandwidth, expected capacity and efficiency. Advisers None disclosed Rambus intends to combine its existing serial connection, memory and security skills, and IP with the newly acquired IP and engineers from the two companies to produce CXL buffering chips and memory controllers, powering next-generation PCIe 6.0 and CXL 3.0 devices.
    [Show full text]
  • Digital Electronics and Analog Photonics for Convolutional Neural Networks (DEAP-Cnns) Viraj Bangari , Bicky A
    IEEE JOURNAL OF SELECTED TOPICS IN QUANTUM ELECTRONICS, VOL. 26, NO. 1, JANUARY/FEBRUARY 2020 7701213 Digital Electronics and Analog Photonics for Convolutional Neural Networks (DEAP-CNNs) Viraj Bangari , Bicky A. Marquez, Member, IEEE, Heidi Miller , Alexander N. Tait , Member, IEEE, Mitchell A. Nahmias , Thomas Ferreira de Lima , Student Member, IEEE, Hsuan-Tung Peng , Paul R. Prucnal, Life Fellow, IEEE, and Bhavin J. Shastri , Senior Member, IEEE Abstract—Convolutional Neural Networks (CNNs) are powerful bottlenecks is computing the matrix multiplication required for and highly ubiquitous tools for extracting features from large forward propagation. In fact, over 80% of the total processing datasets for applications such as computer vision and natural time is spent on the convolution [1]. Therefore, techniques that language processing. However, a convolution is a computationally expensive operation in digital electronics. In contrast, neuromor- improve the efficiency of even forward-only propagation are in phic photonic systems, which have experienced a recent surge of high demand and researched extensively [2], [3]. interest over the last few years, propose higher bandwidth and In this work, we present a complete digital electronic and energy efficiencies for neural network training and inference. Neu- analog photonic (DEAP) architecture capable of performing romorphic photonics exploits the advantages of optical electronics, highly efficient CNNs for image recognition. The competitive including the ease of analog processing, and busing multiple signals on a single waveguide at the speed of light. Here, we propose a MNIST handwriting dataset [4] is used as a benchmark test for Digital Electronic and Analog Photonic (DEAP) CNN hardware our DEAP CNN.
    [Show full text]
  • ENPH 455 Final Report Ultrafast Digital Electronics and Analog Photonics (DEAP) for Convolutional Neural Networks and Image Proc
    ENPH 455 Final Report Ultrafast Digital Electronics and Analog Photonics (DEAP) for Convolutional Neural Networks and Image Processing Viraj Bangari Supervisor: Dr. Bhavin J. Shastri 2019-03-27 Department of Physics, Engineering Physics & Astronomy Queen’s University Abstract Convolutions are computationally intensive mathematical operations used in convolutional neural networks (CNNs) and image processing. Convolution operations are typically delegated to GPUs due to their ability to highly paralleliZe matrix multiplication operations. In recent years, silicon photonics has shown promise in being the next generation of computing hardware that can operate at ultrafast speeds. In particular, the neuromorphic photonic broadcast-and- weight architecture has been able to implement recurrent neural networks while operating at gigahertZ frequencies. Inspired by the principles of broadcast-and-weight, this thesis proposes two photonic architectures that are capable of performing convolution operations. The first architecture, called DEAP, is specialiZed for implementing convolutional neural networks whereas the second, DEAP-GIP, is specialiZed for general-purpose image processing tasks. DEAP is estimated to perform convolutions between 2.8 and 14 times faster than a GPU while roughly using 0.75 times the energy consumption. Additionally, DEAP-GIP is estimated to operate 4.6 to 68 times faster than a GPU while using 2.3 times the energy consumption. The largest bottlenecks for both of these architectures are from the I/O interfacing with digital systems via digital-to-analog converters (DACs) and analog-to-digital converters (ADCs). If photonic DACs [1] and ADCs [2] are to be built with higher bit-precisions, the speedup over GPUs could be even higher.
    [Show full text]
  • Imac 27''/3.1Ghz 6C/8Gb/256Gb/Rp5300-Int
    T. (+371) 27 070 075 Код товара MXWT2_INT SHOP.SELENAL.LV 2 October 2021, 02:18 IMAC 27''/3.1GHZ 6C/8GB/256GB/RP5300-INT Склад Кол-во PRICE Visas cenas norādītas ar PVN. Производитель Apple Processor Memory types supported by processor DDR4 SDRAM, GDDR6 SDRAM Software Operating system Apple macOS Catalina Technical details Gross weight 12.12 kg Net weight 8.92 kg Memory Memory form factor SODIMM 260-pin Memory clock speed (MHz) 2666 Memory slots 4 Maximum internal memory 128 GB Power AC input voltage AC 100-240 V AC input frequency 50/60 Hz Ergonomics Display type TFT Active Matrix Performance Maximum operating temperature 35 °C Minimum Operating Ambient Temperature 10 °C Graphics Discrete graphics adapter model AMD Radeon Pro 5300 Ports & interfaces Audio output Stereo USB 3.0 (3.1 Gen 1) Type-A ports quantity 4 (USB 3.0 (type A)) Ethernet LAN (RJ-45) ports 1 (RJ-45) Display Display resolution 5120 x 2880 Aspect ratio 16:9 Panel type LCD Design Colour of product Silver T. (+371) 27 070 075 Код товара MXWT2_INT SHOP.SELENAL.LV 2 October 2021, 02:18 Общие характеристики Thunderbolt 3 2 (USB Type-C) System Form Factor All-in-One Audio/Video Features Support Thunderbolt 3 digital video output Support native DisplayPort output over USB‑C Thunderbolt 2, HDMI, DVI, and VGA supported using adapters (sold separately) Supports up to two displays with 3840-by-2160 resolution at 60Hz at over... Display Form Factor Built-in Power Device Type Power Supply Maximum Operating Humidity 95 % Minimum Operating Humidity 5 % Networking Form Factor Integrated Number of Speakers 2 Headphones 1 (3.5-mm minijack) Max.
    [Show full text]
  • Microsoft Surface Katalog Nejprodávanějších Modelů
    Microsoft Surface Katalog nejprodávanějších modelů Microsoft Surface 1 Dvě zařízení v jednom Microsoft Dotyková obrazovka, nízká hmotnost a celkově výborná mobilita koncipuje Surface zařízení jako tablet. Kamera vpředu i vzadu dokonce nahradí záznamové zařízení mobilního telefonu. Pohodlná klávesnice, možnosti Software i hardware od konektivity a vysoký výkon plnohodnotných procesorů Intel nabízí Microsoftu – uzavřený kruh zase výhody notebooku či rovnou komfortu a bezpečnosti stolního PC. Pro práci v pohybu a bez omezení. To nejlepší, co svým Windows můžete dát, je Surface. Schopnosti Windows 10 a Surface zařízení jsou tak Proč? rozvíjeny pod hlavičkou jednoho jediného výrobce. Je to nejkratší cesta pro naplnění maximálních možností Unikátní vlastnosti Benefity vycházející z identického výrobce HW i SW technologických novinek a zároveň pro eliminaci vašeho pracovního nástroje jsou zřejmé. Microsoft kritických bodů v oblasti bezpečnosti celého zařízení. Bez nálepky, ale zato s vyrytým logem originálu. začal vyvíjet Surface zařízení tak, aby držela krok s To nejlepší od Microsoftu dohromady. Microsoft Surface pro Microsoft Windows. Vše ostatní technologickými požadavky pro svůj největší projekt – je pro Windows kompromisem. Windows. Displejem s poměrem stran 3:2 začala tato zařízení udávat nový směr. Důvodem je větší prostor pro Elitní mobilní výkon zobrazení informací ve vertikálním směru. Efektivní a dokonalá symbióza v době zaměřené na web. Unikátní pro použitelnost. SW a HW pod hlavičkou Speciálně vyvinutý materiál pro připojitelné klávesnice – Alcantara. Pohodlné, elegantní a s vyšší identického výrobce odolností. Unikátní pro komfort. Windows Hello pro bezpečné přihlašování bylo vyvinuto na tomto hardwaru, či chcete-li, pro tento hardware. Unikátní pro rutinní bezpečnost. Windows Autopilot – přednastavený notebook bez zásahu „ajťáků“. Když už to musí někde fungovat jako Zero-touch zařízení, je to právě u Microsoftu.
    [Show full text]
  • Ficha Técnica Resumida Ficha Técnica Completa
    HP Z2 Mini G5(12M14EA#ABE) HP Z2 Mini G5 - Mini - 1 x Core i9 10900 / 2.8 GHz - vPro - RAM 32 GB - SSD 512 GB - HP Z Turbo Drive,TLC - Quadro T2000 / UHD Graphics P630 - GigE - Win 10 Pro for Workstations 64 bits - monitor:ninguno - teclado:español Ficha Técnica Resumida Descripción del producto HP Z2 Mini G5 - mini - Core i9 10900 2.8 GHz - vPro - 32 GB - SSD 512 GB - español Tipo Estación de trabajo - mini Tecnología de plataforma Intel vPro Platform Localización Idioma:Español / región:España Procesador 1 x Intel Core i9 10ª Gen 10900 / 2.8 GHz (5.2 GHz) (10 núcleos) Socket del procesador FCLGA1200 Socket Características principales del Intel Turbo Boost Technology 3,Intel Thermal Velocity Boost Frequency procesador Memoria caché 20 MB L3 Caché por procesador 20 MB Memoria RAM 32 GB (instalados) / 64 GB (máx.) - DDR4 SDRAM - no ECC - PC4-25600 Controlador de almacenamiento SATA SSD 512 GB - PCI Express 3.0 x4 - M.2 2280 - HP Z Turbo Drive,celda de nivel triple Disco duro (TLC) Almacenamiento óptico Sin unidad óptica Configuración multi-GPU 1 tarjeta GPU sencilla / GPU integrada Controlador gráfico PCIe 3.0 x16 - NVIDIA Quadro T2000 / Intel UHD Graphics P630 Memoria de vídeo 4 GB GDDR6 SDRAM Conexión de redes GigE Teclado Español Alimentación CA 120/230 V (50/60 Hz) OS proporcionado Windows 10 Pro for Workstations 64-bit Edition Dimensiones (Ancho x 21.6 cm x 21.6 cm x 5.8 cm Profundidad x Altura) Peso 2.1 kg Estándares medioambientales TCO Certified Desktops 8 Garantía del fabricante Garantía limitada - piezas y mano de obra - 3 años - in situ Ficha Técnica Completa General Tecnología de plataforma Intel vPro Platform Tipo Estación de trabajo Factor de forma del producto Mini Dispositivos integrados Altavoz Embedded Security Trusted Platform Module (TPM 2.0) Security Chip Localización Idioma:Español / región:España Procesador / Chipset CPU Intel Core i9 10ª Gen 10900 / 2.8 GHz Velocidad turbo máx.
    [Show full text]