Artificial Intelligence Computing for Automotive | Webcast for Xilinx Adapt: Automotive: Anywhere | January 12th, 2021


From Technologies to Markets | Yole Développement © 2020

AGENDA
• Scope
• From CPU to accelerators to platforms
• Levels of autonomy
• Forecasts
• Trends
• Ecosystem
• Conclusions

SCOPE
Understand the impact of Artificial Intelligence on computing hardware for automotive.
[Diagram: automotive computing applications ordered by performance, from highest to lowest:]
• Centralized computing: autonomous driving, Level 5 and Level 4
• Advanced Driver-Assistance Systems (ADAS): Level 3 and Level 2
• Driver environment: infotainment, multimedia computing
• Edge computing, close to the sensor: gesture recognition, speech recognition
Not included in the report: robotic cars' autonomous driving, data center computing, cloud computing.

FROM GENERAL APPLICATIONS TO NEURAL NETWORKS
The focus of the semiconductor industry is shifting from general applications alone to general applications plus neural networks:
• General workloads + deep learning workloads
• Integer operations + floating-point operations
• Workloads that tend to be sequential in nature + workloads that tend to be parallel in nature
Parallelization is key, which explains why this shift is so popular:
• Scalar engines: a few powerful cores that tackle computing tasks sequentially and allocate only a portion of their transistors to floating-point operations.
• Platforms and accelerators: hundreds of specialized cores working in parallel, with most transistors devoted to floating-point operations.
Source: Deep Learning: An Artificial Intelligence Revolution, ARK Investment.

GOING FURTHER WITH DEDICATED PLATFORMS AND ACCELERATORS
From application processors to GPUs to accelerators to platforms:
• Scalar processing: one operation is processed per instruction. CPUs run at clock speeds in the GHz range, and can take a long time to execute a large matrix operation as a sequence of scalar operations.
• Vector processing: the same operation is performed concurrently across a large number of data elements at the same time, following the Single Instruction Multiple Data (SIMD) model. GPUs are effectively vector processors.
• Matrix (graph) processing, at the heart of neural networks: runs many computational processes (vertices) and calculates the effects of these vertices on the other points they interact with via lines (i.e., edges). The overall processing works on many vertices and points simultaneously, and only low precision is needed. These chips go by many names: accelerators, neural engine, tensor processing unit (TPU), neural network processor (NNP), intelligence processing unit (IPU), vision processing unit (VPU), AI processing unit (AIPU), and so on.
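To make the scalar/vector/matrix distinction concrete, here is a minimal sketch in Python/NumPy (ours, not from the webcast; function names and sizes are illustrative assumptions). The explicit loop mimics scalar processing, the NumPy expression stands in for SIMD-style vector processing, and the matrix product is the kind of operation neural-network accelerators implement in dedicated hardware:

```python
import time
import numpy as np

N = 100_000  # illustrative workload size
a = np.random.rand(N).astype(np.float32)
b = np.random.rand(N).astype(np.float32)

def scalar_multiply(x, y):
    """Scalar processing: one operation per instruction, element by element."""
    out = np.empty_like(x)
    for i in range(len(x)):
        out[i] = x[i] * y[i]
    return out

def vector_multiply(x, y):
    """Vector processing (SIMD): one operation applied to all elements at once."""
    return x * y

# Matrix processing: a neural-network layer is essentially a matrix product,
# working on many "vertices" (neurons) and "edges" (weights) simultaneously.
W = np.random.rand(512, 512).astype(np.float32)
x = np.random.rand(512).astype(np.float32)
activation = W @ x

for f in (scalar_multiply, vector_multiply):
    t0 = time.perf_counter()
    f(a, b)
    print(f"{f.__name__}: {time.perf_counter() - t0:.4f} s")
```

On a typical machine the vectorized version runs orders of magnitude faster than the scalar loop, which is the same gap that motivates moving deep-learning workloads from scalar engines to accelerators.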
AUTONOMOUS VEHICLES - THE DISRUPTION CASE
Two distinctive paths for autonomous vehicles (technology x market penetration):
• Improvement of cars as we know them: automated driving is industrialized, the electric car matures, and electronics keeps invading cars.
• Disruption: robotic "cars" fulfill needs in a new plane of consumption, with new use cases. 2020 should see the first phase of commercial implementation of autonomous vehicles, so far below expectation.
[Timeline: the speed of technology change doubles with every technology shift: 1880, 80 years to 1960, 40 years to 2000, 20 years to 2020 (ADAS vehicles), 10 years to 2030, 5 years to 2035 (robotic vehicles).]
Source: Yole Développement, © August 2015.

AUTONOMOUS VEHICLES - THE ROBOTIC DISRUPTION CASE
Two distinctive paths for autonomous vehicles, mapped by where they operate (designated places, designated areas, limited distance, anywhere) and how fast (low speed, medium speed, any speed):
• ADAS vehicles (personal cars, historical players): from Level 1-2 in 2015, through Level 2+ and Level 2++ in 2020, toward Level 3 (is L3 even relevant?) and Level 4-5. Whether autonomous driving anywhere, at any speed, will ever exist remains an open question.
• Robotic vehicles (robotic Mobility-as-a-Service, driven by tech giants, startups and new entrants): taxis and shuttles/buses, from designated places at low speed in 2015, to designated areas in 2020, to limited distances at medium speed in 2022.
Autonomy levels have become marketing definitions, but we consider that they do not represent the reality. The reality is whether a vehicle is autonomous or whether it is not.

AUTOMOTIVE MARKET TREND – YOLE MARCH 2020
[Chart: 2014-2050 light-vehicle sales breakdown forecast by level of autonomy, in million units; series: ADAS Level 0, ADAS Level 1-2, ADAS Level 2+, ADAS Level 2++, ADAS Level 3-4, ADAS Level 5 (left axis, Munits) and robotic cars produced (right axis, Munits); a dip around 2020-2021 marks the Covid-19 impact.]
By 2035, cars with autonomy levels L3-4-5 will represent 25% of global production; the rest will be L1-2-2+.

ARTIFICIAL INTELLIGENCE IN AUTOMOTIVE
2015-2025e total AI computing hardware volume forecast:
• Artificial Intelligence based on images from cameras will generate a total volume of 67M computing-hardware chips in 2025, for ADAS vehicles only.
• Currently, we consider that one chip is used per type of camera, and multiple chips can be found in one car. Fusion of inputs from multiple types of cameras is not considered, as fusion is for now mostly used for inputs from different types of sensors.
• Ultimately, platform solutions will enable a variety of vision formats and sensor inputs, enabling multiple strategies.

Total volumes (Munits):
2015: -     2016: -     2017: 0.09     2018: 0.26     2019: 0.43     2020e: 0.38
2021e: 4.26     2022e: 10.08     2023e: 25.27     2024e: 42.82     2025e: 67.19
CAGR 2019-2025: 132.4%
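The CAGR figure in the table can be reproduced from the volume endpoints with the standard compound-annual-growth-rate formula. The short check below is ours, not Yole's; it uses the rounded table values, which is why it lands slightly under the reported 132.4%:

```python
# CAGR check from the rounded 2019 and 2025 volumes (Munits).
v_2019, v_2025 = 0.43, 67.19
years = 2025 - 2019

cagr = (v_2025 / v_2019) ** (1 / years) - 1
print(f"CAGR 2019-2025 ~ {cagr:.1%}")  # ~132.1% vs. the reported 132.4%;
                                       # the gap comes from unrounded source data
```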
FROM LEVEL 0 TO LEVEL 2++ ON ONE SIDE, ROBOTIC ON THE OTHER
Inclusion of accelerators and multiplication of the number of chips:
• Level 0 to Level 2+/2++ differentiate mostly through improved functionalities such as Automatic Emergency Braking (AEB) and some new functionalities such as Traffic Jam Assist (TJA) or Lane Keeping Assist (LKA).
• On the robotic side, full autonomy was first realized in closed areas at low speed (<15 miles/h), then extended to open designated areas at medium speed (<30 miles/h).
These improvements are realized thanks to the introduction of AI algorithms and the related hardware. Automotive computing introduces AI by following what has been done in consumer applications: implementation moves from inside the SoC, to standalone accelerators, to platforms, while the number of computing chips multiplies (x2, x4, x8).

FROM SENSOR SUITE TO COMPUTING SUITE FOR AUTONOMY
[Diagram: levels and functionalities, from Level 1 (2012) to Level 5/Robotic (2040):]
• Functionalities: AEB L1 and ACC at Level 1; AEB L2 and LKA at Level 2; AEB L2+, PA, TJA and DM at Level 2+/2++; HP at Level 3/4; AP at Level 5/Robotic.
• Number of chips: from 1, to 5, to more than 10.
• Technology penetration: radar; then fusion with forward/rear cameras; then fusion with surround cameras; then fusion with LiDAR.
• Algorithms: computer vision algorithms, then deep learning algorithms (2016 for Tesla, 2021 for others), then an increasing number of neural networks.
• Computing, with performance and ASP rising from low to high: MCU, FPGA, VP, centralized platform.

FOUR BUSINESS MODELS
Each business model gains value from different sources. Four main business models can be identified:
• Car manufacturers: the OEMs.
• Autonomous driving: software and IP/hardware players developing the brains of cars and all the driverless applications.
• Car electrification: battery and powertrain manufacturers turning cars into electrically powered systems.
• Mobility as a service: service providers (robotaxis or shuttles) offering transportation as a service.
Most of the net value will come from mobility as a service. The major value flow will go to the service providers: by changing consumer usage, they will overtake the others.

SUMMARY
Ecosystem:
• Because the technologies are different, the ecosystems and supply chains for ADAS and robotic cars are different.
• In both of these ecosystems, the supply chains are organizing.
• ADAS ecosystems are built around historical automotive OEMs, though classical supply chains go less and less through Tier 1s. Hardware for ADAS is led by Mobileye, but OEM competition is tough.
• Robotic-vehicle ecosystems are built around full-stack solution partnerships.