Characterizing the Deployment of Deep Neural Networks on Commercial Edge Devices
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
Synopsys Insight
Synopsys Insight DESIGNING A SMARTER FUTURE FROM SILICON TO SOFTWARE Subscribe Issue 2 | 2016 Smart, Secure Everything …from Silicon to Software EDA Tools Semiconductor IP Software Integrity Verify Software Bring-Up IP for the Era of FinFET and Software Is Everywhere — And on Your IC Design Before Smart Designs So Are the Vulnerabilities Sending It to the Fab From Silicon to Software: A Agile Development for How to Create a Testbench Quick Guide to Securing IoT Application Security Managers With Synopsys Verification IP Edge Devices and UVM in 5 Steps Addressing Physical Design Challenges in the Age of Customer Spotlight Additional Resources FinFET and Smarter Design Multi-Level Physical Hierarchy Movidius Keynotes IC Compiler Events Floorplanning vs. Two-Level II Technology Symposium With Floorplanning Vision of Smart, Connected Seminars Machines and Networks Webinars Synopsys Users Group (SNUG) Contact us: [email protected] Customer Spotlight Movidius Keynotes IC Compiler II Technology Symposium With Vision of Smart, Connected Machines and Networks Mike Santarini, Director of Corporate Communications Strategy, Synopsys “Is it a Chihuahua’s face or a blueberry muffin?” That is one of many problems that can be solved when deep neural networks are implemented close to the image sensors at the network’s edge, where SoCs with smart vision sensing can perform low-power image classification—a system envisioned by the high-tech industry now striving to bring higher intelligence to everything connected. In the keynote at the Synopsys IC Compiler II Technology Symposium (where Intel, Samsung, Broadcom, Mellanox and ecosystem partners ARM and TSMC also presented), Movidius’ vice president of marketing, Gary Brown, captivated the audience of 250 IC design experts by laying out Movidius’ vision of how a mix of innovative architectural design and advanced IC design tools (such as Synopsys’ IC Compiler II™) is enabling companies like Movidius to implement SoCs with greater algorithmic intelligence, advanced vision sensing, and deep learning techniques. -
Download Apps in Their Smartphones to Interact with Eot Devices
sensors Article Eyes of Things Oscar Deniz 1,*, Noelia Vallez 1, Jose L. Espinosa-Aranda 1, Jose M. Rico-Saavedra 1, Javier Parra-Patino 1, Gloria Bueno 1, David Moloney 2, Alireza Dehghani 2, Aubrey Dunne 2, Alain Pagani 3, Stephan Krauss 3, Ruben Reiser 3, Martin Waeny 4, Matteo Sorci 5, Tim Llewellynn 5, Christian Fedorczak 6, Thierry Larmoire 6, Marco Herbst 7, Andre Seirafi 8 and Kasra Seirafi 8 1 VISILAB, University of Castilla-La Mancha, E.T.S.I.Industriales, Avda Camilo Jose Cela s/n, Ciudad Real 13071, Spain; [email protected] (N.V.); [email protected] (J.L.E.-A.); [email protected] (J.M.R.-S.); [email protected] (J.P.-P.); [email protected] (G.B.) 2 Movidius, 1st Floor, O’Connell Bridge House, D’Olier Street, Dublin 2, Ireland; [email protected] (D.M.); [email protected] (A.D.); [email protected] (A.D.) 3 DFKI, Augmented Vision Research Group, Tripstaddterstr. 122, 67663 Kaiserslautern, Germany; [email protected] (A.P.); [email protected] (S.K.); [email protected] (R.R.) 4 Awaiba, Madeira Tecnopolo, 9020-105 Funchal, Portugal; [email protected] 5 nViso SA, PSE-D, Site EPFL, CH-1015 Lausanne, Switzerland; [email protected] (M.S.); [email protected] (T.L.) 6 THALES Communications & Security, 4 Avenue des Louvresses, 92230 Gennevilliers, France; [email protected] (C.F.); [email protected] (T.L.) 7 Evercam, 6-7 Granby Row, Dublin 1, D01 FW20, Ireland; [email protected] 8 Fluxguide, Burggasse 7-9/9, 1070 Vienna, Austria; andre@fluxguide.com (A.S.); kasra@fluxguide.com (K.S.) * Correspondence: [email protected] Academic Editors: Luca Roselli, Federico Alimenti and Stefania Bonafoni Received: 30 March 2017; Accepted: 17 May 2017; Published: 21 May 2017 Abstract: Embedded systems control and monitor a great deal of our reality. -
Analysis of the Movidius Platform Architecture From
D3.5 Test sets and results with Factor H2020-643924-EoT Form Board Horizon 2020 PROGRAMME ICT-01-2014: Smart Cyber-Physical Systems This project has received funding from the European Union’s Horizon 2020 research and innovation programme under Grant Agreement No 643924 D3.5 Test sets and results with Factor Form Board Copyright © 2015 The EoT Consortium The opinions of the authors expressed in this document do not necessarily reflect the official opinion of EOT partners or of the European Commission. Page 1 of 40 25/9/2017 D3.5 Test sets and results with Factor H2020-643924-EoT Form Board 1. DOCUMENT INFORMATION Deliverable Number D3.5 Deliverable Name Test sets and results with Factor Form Board Authors J. Parra (UCLM), N. Vallez (UCLM), J. M. Rico (UCLM), J. L. Espinosa-Aranda (UCLM), A. Dehghani (MOVIDIUS), S. Krauss (DFKI), R. Reiser (DFKI), A. Pagani (DFKI) Responsible Author Oscar Deniz (UCLM) e-mail:[email protected] phone: +34 926295300 Ext.6286 Keywords EoT Firmware, Tests, EoT Factor form board WP WP3 Nature R Dissemination Level PU Planned Date 01.06.2016 Final Version Date 25.09.2017 Reviewed by O. Deniz (UCLM) Verified by C. Fedorczak (THALES) Page 2 of 40 25/9/2017 D3.5 Test sets and results with EoT H2020-643924-EoT prototypes 2. DOCUMENT HISTORY Person Date Comment Version O. Deniz 30.05.2017 Initial 0.1 J. Parra-Patiño 14.07.2017 0.2 N. Vallez 04.09.2017 0.3 J.L. Espinosa- 25.09.2017 0.4 Aranda O. Deniz 25.09.2017 Final 0.5 Page 3 of 40 25/9/2017 D3.5 Test sets and results with EoT H2020-643924-EoT prototypes 3. -
To Bridge Neural Network Design and Real-World Performance: a Behaviour Study for Neural Networks
TO BRIDGE NEURAL NETWORK DESIGN AND REAL-WORLD PERFORMANCE:ABEHAVIOUR STUDY FOR NEURAL NETWORKS Xiaohu Tang 1 2 Shihao Han 3 2 Li Lyna Zhang 2 Ting Cao 2 Yunxin Liu 2 ABSTRACT The boom of edge AI applications has spawned a great many neural network (NN) algorithms and inference platforms. Unfortunately, the fast pace of development in their fields have magnified the gaps between them. A well-designed NN algorithm with reduced number of computation operations and memory accesses can easily result in increased inference latency in real-world deployment, due to a mismatch between the algorithm and the features of target platforms. Therefore, it is critical to understand the behaviour characteristics of NN design space on target platforms. However, none of existing NN benchmarking or characterization studies can serve this purpose. They only evaluate some sparse configurations in the design space for the purpose of platform optimization rather than the scaling in every design dimension for NN algorithm efficiency. This paper presents the first empirical study on the NN design space to learn NN behaviour characteristics on different inference platforms. The revealed characteristics can be used as guidelines to design efficient NN algorithms. We profile ten-thousand configurations from a cutting-edge NN design space on seven industrial edge AI platforms. Seven key findings as well as their causes and implications for efficient NN design are highlighted. 1 INTRODUCTION Numerous edge devices such as mobile phones, cameras and speakers provide real-world usage scenarios for NN technology. To enable affordable NN inference on edge devices, remarkable innovations have been achieved on the design of efficient NN algorithms and the development of inference platforms (including hardware accelerators and the corresponding inference frameworks on top). -
D4.1 Report on State of the Art of Novel Compute Elements and Gap Analysis in MPI and GASPI
Ref. Ares(2019)558413 - 31/01/2019 HORIZON 2020 TOPIC FETHPC-02-2017 Transition to Exascale Computing Exascale Programming Models for Heterogeneous Systems 801039 D4.1 Report on state of the art of novel compute elements and gap analysis in MPI and GASPI WP4: Productive computing with FPGAs, GPUs and low-power microprocessors Date of preparation (latest version): [DATE] Copyright© 2018-2021 The EPiGRAM-HS Consortium The opinions of the authors expressed in this document do not necessarily reflect the official opinion of the EPiGRAM-HS partners nor of the European Commission. D4.1: Report on state of the art of novel compute elements and gap analysis in MPI and GASPI 2 DOCUMENT INFORMATION Deliverable Number D4.1 Report on state of the art of novel compute elements Deliverable Name and gap analysis in MPI and GASPI Due Date 31/01/2019 (PM5) Deliverable Lead FhG Martin Kühn, Dominik Loroch, Carsten Lojewski, Valeria Bartsch (Fraunhofer ITWM), Gilbert Netzer (KTH), Authors Daniel Holmes (EPCC), Alan Stokes (EPCC), Oliver Brown (EPCC) Valeria Bartsch, Fraunhofer ITWM, Responsible Author [email protected] Keywords Novel compute, programming models WP/Task WP4 /Task 4.3 Nature R Dissemination Level PU Final Version Date 31/01/2019 Sergio Rivas-Gomez (KTH), Oliver Brown (EPCC), Luis Reviewed by Cebamanos (EPCC) MGT Board Approval YES D4.1: Report on state of the art of novel compute elements and gap analysis in MPI and GASPI 3 DOCUMENT HISTORY Partner Date Comment Version First draft (Multi/Many core, GPU gap 0.1 FhG 04.12.2018 -
Embedded Deep Learning for Vehicular Edge Computing
Embedded Deep Learning for Vehicular Edge Computing Jacob Hochstetler∗, Rahul Padidela∗, Qi Chen∗, Qing Yangy and Song Fuy ∗fJacobHochstetler, RahulPadidela, [email protected] yfQing.Yang, [email protected] Department of Computer Science and Engineering University of North Texas Denton, TX, USA Abstract—The accuracy of object recognition has been greatly Vision Processing Unit (VPU). Our results show that the a improved due to the rapid development of deep learning, but mobile edge device assisted by a VPU is able to process the deep learning generally requires a lot of training data and video using a popular NN in real-time. This is important for a the training process is very slow and complex. In this work, an Intel Movidius™ Neural Compute Stick along with Raspberry CAV, since it leaves the main CPU and memory free for V2X Pi 3 Model B is used to analyze the objects in the real time communication or other tasks. images and videos for vehicular edge computing. The results shown in this study tells how the stick performs in conjunction II. MOBILE EDGE COMPUTING SETUP with different operating systems and processing power. Index Terms—Edge computing, Deep learning, Embedded A. Mobile Edge Device computing, Object detection, Vehicular system. The Raspberry Pi (RPi) is a small, single-board computer I. INTRODUCTION (SBC) [7]. Each RPi is based around a Broadcom system The need for low-power but capable inference processors on a chip (SoC), which contains the ARM CPU, RAM, has produced a new class of computing called edge, or GPU and general purpose input/output controllers (GPIO). -
Survey of Machine Learning Accelerators
1 Survey of Machine Learning Accelerators Albert Reuther, Peter Michaleas, Michael Jones, Vijay Gadepally, Siddharth Samsi, and Jeremy Kepner MIT Lincoln Laboratory Supercomputing Center Lexington, MA, USA freuther,pmichaleas,michael.jones,vijayg,sid,[email protected] Abstract—New machine learning accelerators are being an- nounced and released each month for a variety of applications from speech recognition, video object detection, assisted driving, and many data center applications. This paper updates the survey of of AI accelerators and processors from last year’s IEEE- HPEC paper. This paper collects and summarizes the current accelerators that have been publicly announced with performance and power consumption numbers. The performance and power values are plotted on a scatter graph and a number of dimensions and observations from the trends on this plot are discussed and analyzed. For instance, there are interesting trends in the plot regarding power consumption, numerical precision, and inference versus training. This year, there are many more announced accel- erators that are implemented with many more architectures and technologies from vector engines, dataflow engines, neuromorphic Fig. 1. Canonical AI architecture consists of sensors, data conditioning, designs, flash-based analog memory processing, and photonic- algorithms, modern computing, robust AI, human-machine teaming, and users based processing. (missions). Each step is critical in developing end-to-end AI applications and systems. Index Terms—Machine learning, GPU, TPU, dataflow, accel- erator, embedded inference, computational performance This actionable knowledge is then passed to human beings I. INTRODUCTION for decision-making processes in the human-machine teaming phase. The phase of human-machine teaming provides the It has become apparent that researching, developing and users with useful and relevant insight turning knowledge into deploying Artificial Intelligence (AI) and machine learning actionable intelligence or insight. -
The Movidius Stick Optimizing Deep Neural Networks for Deployment On
Optimizing deep neural networks for deployment on the Movidius stick Emiel Deprost Student number: 01503250 Supervisors: Prof. dr. ir. Pieter Simoens, Prof. dr. ir. Bart Dhoedt Counsellor: ing. Sam Leroux Master's dissertation submitted in order to obtain the academic degree of Master of Science in de industriële wetenschappen: elektronica-ICT Academic year 2018-2019 Optimizing deep neural networks for deployment on the Movidius stick Emiel Deprost Student number: 01503250 Supervisors: Prof. dr. ir. Pieter Simoens, Prof. dr. ir. Bart Dhoedt Counsellor: ing. Sam Leroux Master's dissertation submitted in order to obtain the academic degree of Master of Science in de industriële wetenschappen: elektronica-ICT Academic year 2018-2019 Voorwoord Als laatste jaarsstudent wou ik mijn kennis uitbreiden over deep learning. Machine learning kwam al aanbod gedurende mijn opleiding en wekte mijn interesse. Dankzij mijn masterproef heb ik mij in dit onderwerp kunnen verdiepen en is dit een zeer leerzame periode geweest. Aan het einde van deze vierjarige opleiding is het dan ook het ideale moment om iedereen te bedanken. Mijn dank gaat in de eerste plaats uit naar Sam Leroux voor zijn waardevolle begeleiding bij mijn masterproef. Hij heeft mij van het begin tot het einde bijgestaan door talrijke raadgevingen en feedback bij de verkregen resultaten. Graag wil ik ook prof. dr. ir. Pieter Simoens en prof. dr. ir. Bart Dhoedt bedanken voor het thesis onderwerp dat ik aangeboden kreeg en de kans om deze masterproef te maken. Ook wil ik mijn dank betuigen aan de docenten die ons opgeleid hebben in de voorbije vier jaar. Tenslotte wil ik mijn ouders bedanken die mij de kans gegeven hebben om deze boeiende oplei- ding te volgen en telkens klaar stonden om mij te helpen. -
Reimagined Machine Vision with On-Camera Deep Learning
CASE STUDY IndustryAI-Assisted Machine Vision SolutionFLIR Focus Area Reimagined Machine Vision with On-Camera Deep Learning The FLIR® Firefly® camera adds a new level of intelligence to machine vision and image analysis supporting on-device inference with an Intel® Movidius™ Myriad™ 2 vision processing unit. Automated analysis of images captured by cameras is a key part of our day-to-day lives that we may not often think about. The quality and affordability of the phones in our pockets, the cars we drive, and the food on our plates are made possible by machines using cameras for process automation, quality inspection, and robot guidance. Without this kind of “machine vision,” the high speed and high volume required for these tasks would make them too tedious and error-prone for humans to handle reliably or affordably. As this technology has developed, machine vision has gone from enabling basic inspection and sorting operations to more complex tasks, such as guiding manufacturing robots in automotive factories and enhancing surveillance applications. Still, there has been a hard limit on the capabilities of these systems because they rely on pre-established rules. For example, machine vision has been well suited to reading a standardized barcode or checking a manufactured part against specifications, but not to the subjective judgment of whether a piece of fruit is of export quality. Neural networks trained using modern deep learning techniques take image analysis to the next level, being able to recognize objects or people with a high degree of accuracy, for example. Using machine-learning approaches, neural networks can be trained on large data sets, enabling highly accurate decision making. -
On-Board Volcanic Eruption Detection Through Cnns and Satellite Multispectral Imagery
remote sensing Technical Note On-Board Volcanic Eruption Detection through CNNs and Satellite Multispectral Imagery Maria Pia Del Rosso 1,*,†, Alessandro Sebastianelli 1,† , Dario Spiller 2,† , Pierre Philippe Mathieu 2,† and Silvia Liberata Ullo 1,† 1 Engineering Department, University of Sannio, 82100 Benevento, Italy; [email protected] (A.S.); [email protected] (S.L.U.) 2 F-Lab, ASI-ESA, 00044 Frascati, Italy; [email protected] (D.S.); [email protected] (P.P.M.) * Correspondence: [email protected] † These authors contributed equally to this work. Abstract: In recent years, the growth of Machine Learning (ML) algorithms has raised the number of studies including their applicability in a variety of different scenarios. Among all, one of the hardest ones is the aerospace, due to its peculiar physical requirements. In this context, a feasibility study, with a prototype of an on board Artificial Intelligence (AI) model, and realistic testing equipment and scenario are presented in this work. As a case study, the detection of volcanic eruptions has been investigated with the objective to swiftly produce alerts and allow immediate interventions. Two Convolutional Neural Networks (CNNs) have been designed and realized from scratch, showing how to efficiently implement them for identifying the eruptions and at the same time adapting their complexity in order to fit on board requirements. The CNNs are then tested with experimental hardware, by means of a drone with a paylod composed of a generic processing unit (Raspberry PI), an AI processing unit (Movidius stick) and a camera. The hardware employed to build the prototype Citation: Del Rosso, M.P.; is low-cost, easy to found and to use.