
September 2, 2020 @qualcomm_tech

Pushing the boundaries of AI research. Qualcomm Technologies, Inc.

Our AI leadership: over a decade of cutting-edge AI R&D, speeding up commercialization and enabling scale

[Timeline, 2007 to 2020. Research milestones: Qualcomm Research initiates first AI project (2007); research in spiking neural networks (2009); AlexNet wins ImageNet competition; research in artificial neural processing architectures; investment in and research collaboration with Brain Corp; acquired EuVision; deep-learning-based face detection; MWC demo showcasing photo sorting and handwriting recognition; opened Qualcomm Research Netherlands; opened joint research lab with the University of Amsterdam; acquired Scyfer; Caffe2, TensorFlow, and ONNX supported; Qualcomm Artificial Intelligence Research initiated; Mobile AI Enablement Center opened in Taiwan; completed Brain Corp joint research; Brain Corp raises $114M; gauge equivariant CNNs; researchers win best paper at ICLR; power efficiency gains through compression, quantization, and compilation; AI Model Efficiency Toolkit (AIMET) open sourced.

Product milestones: 1st Gen Qualcomm AI Engine (Qualcomm Snapdragon 820 Mobile Platform); Qualcomm Neural Processing SDK; 2nd Gen Qualcomm AI Engine (Snapdragon 835); Snapdragon 660 and 630; 3rd Gen Qualcomm AI Engine (Snapdragon 845); Qualcomm Vision Intelligence Platform; Snapdragon 710; Snapdragon Automotive Cockpit; 4th Gen Qualcomm AI Engine (Snapdragon 855); Snapdragon 665, 730, and 730G; Qualcomm Cloud AI 100; Qualcomm QCS400 (first audio SoC); 5th Gen Qualcomm AI Engine (Snapdragon 865); Qualcomm Snapdragon Ride Platform; Qualcomm Robotics RB3 and RB5 Platforms.]

Consistent AI R&D investment is the foundation for product leadership.

Qualcomm Artificial Intelligence Research is an initiative of Qualcomm Technologies, Inc. Qualcomm Neural Processing SDK, Qualcomm Vision Intelligence Platform, Qualcomm AI Engine, Qualcomm Cloud AI, Qualcomm Snapdragon Ride, Qualcomm Robotics RB3 Platform, Qualcomm Robotics RB5 Platform, and Qualcomm QCS400 are products of Qualcomm Technologies, Inc. and/or its subsidiaries. AI Model Efficiency Toolkit is a product of Qualcomm Innovation Center, Inc.

Advancing AI research to make efficient AI ubiquitous, from mobile and automotive to IoT/IIoT, edge cloud, and cloud:
- Perception: object detection, speech recognition, contextual fusion
- Power efficiency: model design, compression, quantization, algorithms, efficient hardware, tools
- Personalization: continuous learning; contextual, always-on, privacy-preserved, distributed, on-device learning
- Efficient learning: robust learning through minimal data, unsupervised learning
- Reasoning: scene understanding, language understanding, behavior prediction
- Action: reinforcement learning for decision making

A platform to scale AI across the industry.

Leading research and development across the entire spectrum of AI

Fundamental and applied research topics:
- G-CNN; deep transfer learning; neural network compression; graph and kernel optimization; deep learning for graphics; AI Model Efficiency Toolkit
- Bayesian combinatorial optimization; deep generative models; neural network quantization; voice UI; compute in memory; source compression; CV; DL for new sensors
- Bayesian distributed learning; hybrid reinforcement learning; hardware-aware deep learning; fingerprint; video recognition and prediction; power management
- AI for wireless; energy-efficient perception; AI for radar; quantum AI

[Diagram: a trained neural network model performs inference, turning new input data into output.]

Quantization: automated reduction in the precision of weights and activations while maintaining accuracy.

Models are trained at high precision (32-bit floating point, e.g. 3452.3194) and run inference at lower precision (8-bit integer, e.g. 3452). Promising results show that low-precision integer inference can become widespread: virtually the same accuracy between an FP32 and a quantized AI model is achieved through:
- Automated, data-free, post-training methods
- An automated, training-based mixed-precision method

The result is a >4x increase in performance per watt from savings in memory and compute.¹

1: FP32 model compared to an INT8 quantized model
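The FP32-to-INT8 mapping described here (3452.3194 stored as the integer 3452) can be sketched as symmetric post-training quantization. This is an illustrative toy, not Qualcomm's implementation; the weights and scale choice are made up:

```python
import numpy as np

def quantize_int8(x, scale):
    """Round FP32 values to the nearest INT8 code at the given scale."""
    return np.clip(np.round(x / scale), -128, 127).astype(np.int8)

def dequantize(q, scale):
    """Map INT8 codes back to the FP32 domain."""
    return q.astype(np.float32) * scale

# Weights trained at high precision (FP32)
w = np.array([0.45, -1.2, 0.003, 0.89], dtype=np.float32)

# Per-tensor scale so the largest magnitude maps onto the INT8 range
scale = float(np.abs(w).max()) / 127.0

q = quantize_int8(w, scale)      # stored in 8 bits instead of 32
w_hat = dequantize(q, scale)     # values seen at inference time

max_err = float(np.abs(w - w_hat).max())
print(q.dtype, max_err)          # rounding error is at most scale / 2
```

Data-free and training-based methods differ in how they pick scales and rounding choices, but the memory and compute savings come from this 32-bit to 8-bit reduction.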

Leading research to efficiently quantize AI models: pushing the limits of what's possible with quantization

Data-free quantization: How can we make quantization as simple as possible? We created an automated method that addresses bias and imbalance in weight ranges. No training; data free.

AdaRound: Is rounding to the nearest value the best approach for quantization? We created an automated method for finding the best rounding choice. No training; minimal unlabeled data.

Bayesian Bits: Can we quantize layers to different bit widths based on precision sensitivity? We created a novel method to learn mixed-precision quantization that jointly learns bit-width precision and pruning. Training and training data required.
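The bias and imbalance in weight ranges that data-free quantization addresses can be illustrated with cross-layer scaling: for two layers separated by a ReLU, a positive per-channel factor can be moved from one layer into the next without changing the network's output, equalizing the weight ranges before quantization. A minimal sketch with made-up weights, not the paper's exact algorithm:

```python
import numpy as np

rng = np.random.default_rng(0)

# Two consecutive layers with very imbalanced per-channel weight ranges
W1 = rng.normal(size=(4, 8)) * np.array([10.0, 0.1, 1.0, 5.0])[:, None]
W2 = rng.normal(size=(3, 4))

def forward(x, A, B):
    return B @ np.maximum(A @ x, 0.0)   # ReLU between the layers

# Per-output-channel range of W1 and per-input-channel range of W2
r1 = np.abs(W1).max(axis=1)
r2 = np.abs(W2).max(axis=0)
s = np.sqrt(r1 / r2)     # equalizes ranges: r1/s == r2*s == sqrt(r1*r2)

W1_eq = W1 / s[:, None]
W2_eq = W2 * s[None, :]

x = rng.normal(size=8)
# ReLU(s*x) == s*ReLU(x) for s > 0, so the network function is unchanged
print(np.allclose(forward(x, W1, W2), forward(x, W1_eq, W2_eq)))
```

After equalization, one shared quantization scale fits all channels far better, which is why the method needs neither training nor data.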

Data-free quantization: SOTA 8-bit results, making 8-bit weight quantization ubiquitous. AdaRound: SOTA 4-bit weight results, making 4-bit weight quantization ubiquitous. Bayesian Bits: SOTA mixed-precision results, automating mixed-precision quantization and enabling the tradeoff between accuracy and kernel bit-width.

Data-free quantization: >4x increase in performance per watt against FP32 MobileNet V2 while only losing 0.5% of accuracy. AdaRound: >8x increase in performance per watt against FP32 MobileNet V2 while only losing 2.5% of accuracy. Bayesian Bits: >8x increase in performance per watt against FP32 MobileNet V2 while only losing 0.8% of accuracy.
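AdaRound's question, whether rounding to the nearest value is best, can be made concrete by scoring rounding choices on the layer's output error rather than per-weight error. This brute-force toy uses made-up weights and calibration data; the paper instead learns the choice via a continuous relaxation:

```python
import numpy as np
from itertools import product

rng = np.random.default_rng(1)

w = rng.uniform(-2, 2, size=4)      # one output neuron's FP32 weights
X = rng.normal(size=(4, 100))       # unlabeled calibration activations

def output_error(w_q):
    """MSE of the layer output caused by replacing w with w_q."""
    return float(np.mean((w @ X - w_q @ X) ** 2))

floor = np.floor(w)
nearest = np.round(w)

# Pick round-down (+0) or round-up (+1) per weight to minimize the error
# of the layer *output*, not of each weight in isolation.
candidates = [floor + np.array(bits) for bits in product([0, 1], repeat=4)]
best = min(candidates, key=output_error)

print(output_error(nearest), output_error(best))
```

Since rounding to nearest is one of the enumerated candidates, the output-aware choice is never worse, and at low bit widths it is often strictly better.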

Data-Free Quantization Through Weight Equalization and Bias Correction (Nagel, van Baalen, et al., ICCV 2019); Up or Down? Adaptive Rounding for Post-Training Quantization (Nagel, Amjad, et al., ICML 2020); Bayesian Bits: Unifying Quantization and Pruning (van Baalen, Louizos, et al., arXiv 2020)

Leading quantization research and fast commercialization: driving the industry towards integer inference and power-efficient AI

Quantization research: Relaxed Quantization (ICLR 2019), Data-free Quantization (ICCV 2019), AdaRound (ICML 2020), Bayesian Bits (arXiv 2020)

Quantization commercialization: Qualcomm Neural Processing SDK

Quantization open-sourcing: AI Model Efficiency Toolkit (AIMET)

AIMET makes AI models small: an open-sourced GitHub project that includes state-of-the-art quantization and compression techniques from Qualcomm AI Research

[Diagram: a trained TensorFlow or PyTorch AI model passes through the AI Model Efficiency Toolkit (AIMET), which applies compression and quantization to produce an optimized AI model.]

If interested, please join the AIMET GitHub project: https://github.com/quic/aimet


Features: state-of-the-art network compression tools; state-of-the-art quantization tools; support for both TensorFlow and PyTorch; benchmarks and tests for many models; developed by professional software developers

An increasing demand for energy-efficient video processing: valuable information is being extracted from video streams across diverse devices and use cases.

Developing AI approaches to reduce unnecessary computation: recognizing redundancy between video frames so that the same thing is never computed twice.

Enhanced perception through object detection and semantic segmentation; advanced camera features, like video enhancement and super-resolution

Feature maps over time for ResNet18 remain mostly constant

Advanced video understanding, like search and surveillance; increased video compression to address the demand for rich media.

Tremendous redundancy across frames!
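The redundancy-across-frames observation suggests a simple scheme: compare each block of the new frame with the previous one and recompute only the blocks that changed. A toy sketch with synthetic frames and a hypothetical change threshold:

```python
import numpy as np

rng = np.random.default_rng(2)

frame_t = rng.random((64, 64)).astype(np.float32)
frame_t1 = frame_t.copy()
frame_t1[8:16, 8:16] += 0.5       # only one 8x8 region actually changes

BLOCK = 8
changed, total = 0, 0
for i in range(0, 64, BLOCK):
    for j in range(0, 64, BLOCK):
        total += 1
        diff = np.abs(frame_t1[i:i + BLOCK, j:j + BLOCK] -
                      frame_t[i:i + BLOCK, j:j + BLOCK]).max()
        if diff > 1e-3:           # recompute only where content moved
            changed += 1

print(f"{changed}/{total} blocks need recomputation")  # 1/64
```

When feature maps stay mostly constant over time, as noted for ResNet18, the same skip-unchanged-work logic can be applied inside the network rather than only at the pixel level.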

Conditional computation using gated networks: introduce gates with low computation cost to skip unnecessary computation in neural networks

Problem: a large portion of the neural network is not necessary for the prediction, wasting computations.

Solution: conditional channel-gated networks for task-aware continual learning.
- Dynamically select the filters conditioned on the task and input
- Automatically infer the task from gating patterns

Gating module: average pooling reduces the H × W × C input feature map to 1 × 1 × C, and a 2-layer MLP produces 1 × 1 × C′ channel gates that switch conv + bn blocks on or off.

Result: state-of-the-art accuracy while reducing computation by up to 3X.

[Plot: ImageNet top-1 accuracy (0.60 to 0.76) vs. MACs (1 to 4 × 10⁹), comparing ResNet18/34/50-BAS (ours) against ResNet baselines (18, 34, 50), Convnet-FBS18, Convnet-AIG18/50, and Convnet-SIG34.]
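The gating module (average pooling followed by a 2-layer MLP emitting per-channel gates) can be sketched in a few lines. The weights here are random placeholders and the hard threshold stands in for the trained gating function:

```python
import numpy as np

rng = np.random.default_rng(3)

C, H, W = 16, 8, 8
x = rng.normal(size=(C, H, W))          # input feature map, H x W x C

# Untrained placeholder weights for the 2-layer gating MLP
W1 = rng.normal(size=(32, C)) * 0.1
W2 = rng.normal(size=(C, 32)) * 0.1

pooled = x.mean(axis=(1, 2))            # average pooling: H x W x C -> C
hidden = np.maximum(W1 @ pooled, 0.0)   # 2-layer MLP over pooled features
logits = W2 @ hidden
gates = (logits > 0).astype(np.float64) # hard binary gate per channel

# Only channels with an open gate are computed; the rest are skipped
y = x * gates[:, None, None]
print(f"{int(gates.sum())}/{C} channels computed")
```

The gate computation itself is tiny compared to the convolutions it can skip, which is where the up-to-3X MAC reduction comes from.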

1. Bejnordi, Babak Ehteshami, Tijmen Blankevoort, and Max Welling. "Batch shaping for learning conditional channel gated networks." ICLR (2020).
2. Abati, Davide, Jakub Tomczak, Tijmen Blankevoort, Simone Calderara, Rita Cucchiara, and Babak Ehteshami Bejnordi. "Conditional Channel Gated Networks for Task-Aware Continual Learning." CVPR (2020).

Generative models: powerful capabilities, broad applications

Deep generative model research: given unlabeled training data, generate new samples from the same distribution.
- Models: variational auto-encoder (VAE)*, generative adversarial network (GAN), auto-regressive, invertible
- Powerful capabilities: extract features by learning a low-dimensional feature representation for unsupervised learning; sampling to generate, restore, predict, or compress data
- Broad applications: speech/video compression, text to speech, graphics rendering, computational photography, voice UI

* VAE first introduced by D. Kingma and M. Welling in 2013

Example use case: encoder/decoder

[Diagram: a generative model used as an encoder/decoder. Unlabeled input data, such as images, passes through the encoder part into a low-dimensional feature representation, then through the decoder part, with the desired output as close as possible to the input.]
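A minimal stand-in for the encoder/decoder use case is a linear autoencoder, whose optimum is PCA: the encoder extracts a low-dimensional feature representation from unlabeled data and the decoder reconstructs an output close to the input. The data here is synthetic:

```python
import numpy as np

rng = np.random.default_rng(4)

# Unlabeled data that actually lives near a 3-dimensional subspace
latent = rng.normal(size=(500, 3))
mixing = rng.normal(size=(3, 20))
X = latent @ mixing + 0.01 * rng.normal(size=(500, 20))

# A linear autoencoder's optimum is PCA: top-k right singular vectors
_, _, Vt = np.linalg.svd(X, full_matrices=False)
encoder = Vt[:3].T        # 20 -> 3: low-dimensional feature representation
decoder = Vt[:3]          # 3 -> 20: reconstruct in the input space

codes = X @ encoder       # encoder part: extract features
X_hat = codes @ decoder   # decoder part: output close to the input

err = float(np.mean((X - X_hat) ** 2) / np.mean(X ** 2))
print(err)                # small relative reconstruction error
```

A VAE replaces the linear maps with neural networks and makes the code a probability distribution, which is what enables sampling new data rather than only reconstructing inputs.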

Novel machine learning-based video codec research: neural network based I-frame and P-frame compression

[Diagram: an end-to-end deep learning neural codec. At T=1, the I-frame passes through the I-frame encoder, entropy encoding produces the bitstream, and entropy decoding plus the I-frame decoder reconstruct the frame. At T=2 and T=3, motion estimation against the previous reconstruction drives the P-frame encoder; entropy coding transmits the motion vector and residual; the decoder warps the previous reconstruction with the motion vector and sums the decoded residual, with the reconstruction compared against ground truth.]
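The P-frame path can be caricatured in a few lines: predict the current frame by warping the previous reconstruction with a motion vector, then transmit only the residual. Real codecs estimate dense motion and entropy-code both signals; here the motion is a known global shift:

```python
import numpy as np

rng = np.random.default_rng(5)

prev = rng.random((32, 32)).astype(np.float32)   # previous reconstruction
motion = (2, 3)                                  # global motion vector
curr = np.roll(prev, motion, axis=(0, 1))        # next frame: shifted content
curr[0, 0] += 0.25                               # plus a small novel detail

def warp(frame, mv):
    """Motion compensation: shift the reference frame by the motion vector."""
    return np.roll(frame, mv, axis=(0, 1))

predicted = warp(prev, motion)   # the decoder can predict this for free
residual = curr - predicted      # only this (plus the mv) is transmitted

reconstruction = predicted + residual
print(float(np.abs(residual).sum()))   # tiny: one changed pixel
```

Because the residual is nearly zero wherever motion compensation succeeds, it costs far fewer bits than the frame itself, which is the source of P-frame compression gains.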

Feedback Recurrent Autoencoder for Video Compression (Golinski, et al., arXiv 2020). Video used in the images is licensed under CC BY-NC-ND 4.0: https://media.xiph.org/video/derf/ElFuente/Netflix_Tango_Copyright.txt

Achieving state-of-the-art rate-distortion compared with other learned video compression solutions

Applying AI to solve difficult wireless challenges: deep wireless domain knowledge is required to optimally use AI capabilities

Wireless challenges and matching AI strengths:
- Hard-to-model problems: learning representations for hard-to-model problems
- Computational infeasibility of the optimal solution: training policies and computationally realizable solutions
- Efficient modem parameter optimization: learning to compensate for non-linearities

AI-enhanced wireless communications

Applying AI to RF for centimeter-accurate positioning. Precise indoor positioning is valuable across industries and for location-aware services, but modeling indoor RF propagation is complex:
- Scattering: RF splits or is an aggregate of many reflections due to complex surfaces
- Diffraction: the RF path is bent as it passes through an object
- Reflection: RF bounces off surfaces (robot arm, car, wall)

AI for indoor positioning: learn the complex physics of propagation and enable highly accurate positioning.

Neural unsupervised learning from RF for positioning: injecting domain knowledge for interpretability

[Diagram: a real Tx transmits in a room with a reflector, e.g., a wall. The Rx measures aggregate CSI from all paths originating from the real Tx: the line-of-sight path and the reflection from the wall. A reflected path is equivalent to a direct path from the mirror image of the Tx (a virtual Tx), so virtual Tx locations are learnable. A neural network combined with conventional physical channel modeling ("neural augmentation") maps CSI to position, turning the real map into a learned map.]

Physics of reflections: the receiver (Rx) collects unlabeled channel state information (CSI) observations throughout the environment. The goal is to learn the virtual Tx locations and how to triangulate using CSI.

Neural augmentation: the neural network uses a generative auto-encoder plus conventional channel modeling (based on the physics of propagation) to train on the observations and learn the environment.

Incredible results: the neural network learns the virtual transmitter locations up to isometries, completely unsupervised. With a few labeled measurements, map ambiguity is resolved to achieve cm-level positioning.
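The mirror-image idea is plain geometry: reflecting the transmitter across the wall turns the bounced path into a straight line, so its length equals the direct distance from the virtual Tx to the Rx. A sketch with made-up coordinates and a wall along y = 0:

```python
import math

# Made-up geometry: transmitter, receiver, and a reflecting wall along y = 0
tx = (1.0, 3.0)
rx = (6.0, 2.0)

def reflected_path_length(tx, rx):
    """Shortest Tx -> wall -> Rx path, found by brute-force search."""
    return min(math.dist(tx, (x, 0.0)) + math.dist((x, 0.0), rx)
               for x in (i / 1000.0 for i in range(-2000, 10001)))

# Mirror-image trick: reflect Tx across the wall and the bounced path
# becomes a direct path from the virtual Tx to the Rx.
virtual_tx = (tx[0], -tx[1])
direct_from_virtual = math.dist(virtual_tx, rx)

print(reflected_path_length(tx, rx), direct_from_virtual)  # nearly equal
```

This equivalence is what lets a learner treat each dominant reflection as an extra direct-path transmitter whose position can be inferred from CSI.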

AI radar

Perceptive radar: traditional radar is affordable and responsive, has a long range, measures velocity directly, and isn't compromised by lighting or weather conditions. Applying deep learning directly to the radar signal improves virtually all existing radar capabilities.

Research results: significant improvements in position and size estimation, velocity estimation, object classification, and uncertainty estimation. Robust performance even in complex scenarios.

Complementary sensor: each sensor has its own strengths and complements other sensors. A car can see best when utilizing all of its sensors together, otherwise known as sensor fusion.

Future research areas: radar compression, elevation estimation, drivable space, sparse radar sensing, pedestrian sensing, range extension, adaptive sampling, and sensor fusion research.

Paving the road to autonomous driving with more perceptive radar

AI radar detects occluded vehicles in complex scenarios: video shows the accuracy between AI radar and ground truth up to 94 meters

Performance in complex scenarios: robust performance even in crowded scenarios

Superposition: each qubit is both 1 and 0 at the same time. Entanglement: qubits that are inextricably linked such that whatever happens to one immediately affects the other. Quantum computing: utilize quantum mechanics to achieve exponential speedup.

From Bayesian bits to quantum bits Through quantum computing, utilize quantum mechanics to achieve exponential speedup

[Diagram comparing a classical bit (0 or 1), a Bayesian bit, and a quantum bit (qubit, a superposition of 0 and 1).]
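The qubit can be simulated directly: it is a normalized two-component state vector, and a Hadamard gate puts |0⟩ into an equal superposition of 0 and 1. A minimal sketch:

```python
import numpy as np

# A classical bit is 0 or 1; a qubit is a normalized vector over both
zero = np.array([1.0, 0.0])                  # |0>
H = np.array([[1.0, 1.0],
              [1.0, -1.0]]) / np.sqrt(2.0)   # Hadamard gate

state = H @ zero                # equal superposition of 0 and 1
probs = np.abs(state) ** 2      # measurement probabilities

print(probs)        # [0.5, 0.5]: "both 1 and 0 at the same time"
print(H @ state)    # applying H again returns |0> exactly
```

An n-qubit state needs 2^n amplitudes, which is both why classical simulation hits its limits and where the hoped-for exponential speedup comes from.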

Applying quantum mechanics to machine learning: fundamental green-field research to utilize the exponential power of quantum computing on various use cases

Quantum annealing: combinatorial optimization problems are widespread, including physical design and architecture search. Classical computing hits its limits for a large number of states, while quantum mechanics gives fast search solutions to combinatorial problems.

Quantum deep learning: the statistics of quantum mechanics can apply to deep learning. Exploration of quantum-inspired classical algorithms; quantum binary networks are efficient on classical devices.

Quantum deformed binary neural networks: running a classical neural network on a quantum computer.

Key ideas:
- Deform a classical neural network into a quantum neural network with quantum gates
- Run on a quantum computer, or efficiently simulate on a classical computer

Initial result: 98.7% accuracy on MNIST.

[Diagram: inputs X1...Xn are encoded into qubits, unitary weight gates U and Hadamard gates transform the state, and measurement reads out the output probabilities O1...Ot.]

First quantum binary neural network for real data!

Advancing AI research to make power-efficient AI ubiquitous, from device to cloud

Conducting leading research and development across the entire spectrum of AI

Creating an AI platform fundamental to scaling AI across the industry and applications

www.qualcomm.com/ai

www.qualcomm.com/news/onq

Questions? Connect with us:
@qualcomm_tech
http://www.youtube.com/playlist?list=PL8AD95E4F585237C1&feature=plcp

http://www.slideshare.net/qualcommwirelessevolution

Thank you

Follow us on: For more information, visit us at: www.qualcomm.com & www.qualcomm.com/blog

Nothing in these materials is an offer to sell any of the components or devices referenced herein.

©2018-2020 Qualcomm Technologies, Inc. and/or its affiliated companies. All Rights Reserved. Qualcomm, Snapdragon, and Snapdragon Ride are trademarks or registered trademarks of Qualcomm Incorporated. Other products and brand names may be trademarks or registered trademarks of their respective owners.

References in this presentation to "Qualcomm" may mean Qualcomm Incorporated, Qualcomm Technologies, Inc., and/or other subsidiaries or business units within the Qualcomm corporate structure, as applicable. Qualcomm Incorporated includes Qualcomm's licensing business, QTL, and the vast majority of its patent portfolio. Qualcomm Technologies, Inc., a wholly-owned subsidiary of Qualcomm Incorporated, operates, along with its subsidiaries, substantially all of Qualcomm's engineering, research and development functions, and substantially all of its product and services businesses, including its semiconductor business, QCT.