Pushing the Boundaries of AI Research Qualcomm Technologies, Inc

September 2, 2020 | @qualcomm_tech

Our AI leadership
Over a decade of cutting-edge AI R&D, speeding up commercialization and enabling scale. Milestones from 2007 to 2020 include:

- 2007: Qualcomm Research initiates its first AI project
- Research in spiking neural networks and artificial neural processing architectures
- Deep-learning-based AlexNet wins the ImageNet competition
- Investment and research collaboration with Brain Corp; Brain Corp later raises $114M
- Acquired EuVision; opened a joint research lab with the University of Amsterdam; acquired Scyfer; opened Qualcomm Research Netherlands
- Power-efficiency gains through compression, quantization, and compilation
- MWC demo showcasing face detection with deep learning, photo sorting, and handwriting recognition
- Collaboration with Google on TensorFlow; Facebook Caffe2 support; ONNX supported by Microsoft, Facebook, and Amazon
- Mobile AI Enablement Center opened in Taiwan
- Qualcomm Technologies researchers win best paper at ICLR; research on gauge-equivariant CNNs
- 2020: AI Model Efficiency Toolkit (AIMET) open-sourced

Consistent AI R&D investment is the foundation for product leadership:

- Qualcomm AI Engine, 1st through 5th generation: Snapdragon 820 (1st Gen), 835 (2nd Gen), 845 (3rd Gen), 855 (4th Gen), and 865 (5th Gen); also Snapdragon 710
- Qualcomm Neural Processing SDK; Snapdragon 660, 630, 665, 730, and 730G
- Vision Intelligence Platform; Snapdragon Automotive Cockpit Platform; Snapdragon Ride Platform
- Qualcomm Cloud AI 100
- Qualcomm QCS400 (first audio SoC); Qualcomm Robotics RB3 and RB5 Platforms

Qualcomm AI Research is an initiative of Qualcomm Technologies, Inc. Qualcomm Snapdragon, Qualcomm Neural Processing SDK, Qualcomm Vision Intelligence Platform, Qualcomm AI Engine, Qualcomm Cloud AI, Qualcomm Snapdragon Ride, Qualcomm Robotics RB3 Platform, Qualcomm Robotics RB5 Platform, and Qualcomm QCS400 are products of Qualcomm Technologies, Inc. and/or its subsidiaries. AI Model Efficiency Toolkit is a product of Qualcomm Innovation Center, Inc.

Advancing AI research to make efficient AI ubiquitous
A platform to scale AI across the industry, spanning cloud, edge cloud, mobile, automotive, and IoT/IIoT:

- Perception: object detection, speech recognition, contextual fusion
- Power efficiency: model design, compression, quantization, algorithms, efficient hardware, and software tools
- Personalization: continuous learning; contextual, always-on, privacy-preserved, distributed learning
- Efficient learning: robust learning through minimal data, unsupervised learning, on-device learning
- Reasoning: scene understanding, language understanding, behavior prediction
- Action: reinforcement learning for decision making

Leading research and development across the entire spectrum of AI
Topics span fundamental and applied research: deep neural network compression and quantization; G-CNN transfer learning; graph and kernel optimization; Bayesian generative models; Bayesian distributed learning; hybrid reinforcement learning; combinatorial optimization; hardware-aware deep learning; compute in memory; deep learning for graphics; deep learning for new sensors; source compression; computer vision; voice UI; video recognition and prediction; fingerprint; power management; the AI Model Efficiency Toolkit; energy-efficient perception; AI for wireless; AI for radar; and quantum AI.

Quantization
A trained neural network model maps new input data to inference output. Quantization is the automated reduction in the precision of weights and activations while maintaining accuracy. Promising results show that low-precision integer inference can become widespread.
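The core mapping behind integer inference, taking FP32 values to low-precision integers and back, can be sketched in a few lines of NumPy. This is a minimal illustration of affine post-training quantization, not the production method used in Qualcomm products; the tensor sizes and helper names are invented for the example.

```python
import numpy as np

def quantize_int8(x):
    """Affine (asymmetric) quantization of an FP32 tensor to INT8.

    Returns the quantized integers plus the (scale, zero_point)
    needed to map them back to real values.
    """
    qmin, qmax = -128, 127
    scale = (x.max() - x.min()) / (qmax - qmin)
    zero_point = int(round(qmin - x.min() / scale))
    q = np.clip(np.round(x / scale) + zero_point, qmin, qmax).astype(np.int8)
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Map INT8 values back to approximate FP32 values."""
    return (q.astype(np.float32) - zero_point) * scale

rng = np.random.default_rng(0)
w = rng.normal(0.0, 0.2, size=(64, 64)).astype(np.float32)  # stand-in FP32 weights
q, s, z = quantize_int8(w)
w_hat = dequantize(q, s, z)

print(q.nbytes / w.nbytes)             # 0.25: the 4x memory saving
print(float(np.abs(w - w_hat).max()))  # small rounding error, on the order of the scale
```

Note that `np.round` rounds to the nearest integer; AdaRound, discussed below, questions exactly that choice and learns whether each weight should round up or down instead.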
Models are trained at high precision and run inference at lower precision: 32-bit floating point becomes 8-bit integer (e.g., 3452.3194 becomes 3452). Virtually the same accuracy is achieved between the FP32 and quantized AI model through:

- Automated, data-free, post-training methods
- Automated, training-based mixed-precision methods

This yields a >4x increase in performance per watt from savings in memory and compute (FP32 model compared to an INT8 quantized model). Leading research to efficiently quantize AI models continues below.

Pushing the limits of what's possible with quantization

- Data-free quantization. How can we make quantization as simple as possible? An automated method that addresses bias and imbalance in weight ranges: no training, data free, state-of-the-art 8-bit results. Making 8-bit weight quantization ubiquitous: a >4x increase in performance per watt against FP32 MobileNet V2 while losing only 0.5% of accuracy.
- AdaRound. Is rounding to the nearest value the best approach for quantization? An automated method for finding the best rounding choice: no training, minimal unlabeled data, state-of-the-art 4-bit weight results. Making 4-bit weight quantization ubiquitous: a >8x increase in performance per watt against FP32 MobileNet V2 while losing only 2.5% of accuracy.
- Bayesian Bits. Can we quantize layers to different bit widths based on precision sensitivity? A novel method to learn mixed-precision quantization: training and training data required; jointly learns bit-width precision and pruning; state-of-the-art mixed-precision results. Automating mixed-precision quantization and enabling the tradeoff between accuracy and kernel bit-width: a >8x increase in performance per watt against FP32 MobileNet V2 while losing only 0.8% of accuracy.

Papers:
- "Data-Free Quantization Through Weight Equalization and Bias Correction" (Nagel, van Baalen, et al., ICCV 2019)
- "Up or Down? Adaptive Rounding for Post-Training Quantization" (Nagel, Amjad, et al., ICML 2020)
- "Bayesian Bits: Unifying Quantization and Pruning" (van Baalen, Louizos, et al., arXiv 2020)

Leading quantization research and fast commercialization
Driving the industry towards integer inference and power-efficient AI:

- Quantization research: Relaxed Quantization (ICLR 2019), Data-Free Quantization (ICCV 2019), AdaRound (ICML 2020), Bayesian Bits (arXiv 2020)
- Quantization commercialization: Qualcomm Neural Processing SDK
- Quantization open-sourcing: AI Model Efficiency Toolkit (AIMET)

AIMET makes AI models small
AIMET is an open-sourced GitHub project that includes state-of-the-art quantization and compression techniques from Qualcomm AI Research. It takes a trained TensorFlow or PyTorch model, applies compression and quantization, and produces an optimized AI model. If interested, please join the AIMET GitHub project: https://github.com/quic/aimet
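The cross-layer weight equalization at the heart of data-free quantization rests on a simple identity: for any per-channel scale s > 0, relu(x / s) = relu(x) / s, so scales can be moved between consecutive layers without changing the network's output, while balancing the per-channel weight ranges that otherwise hurt quantization. Below is a minimal NumPy sketch using fully connected layers; all shapes and values are invented for illustration, and the bias-correction half of the ICCV 2019 method is omitted.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def equalize(W1, b1, W2):
    """Cross-layer equalization for layer2(relu(layer1(x))).

    Rescales each output channel i of W1 (and b1) by 1/s_i and the matching
    input column of W2 by s_i. Because relu(x / s) == relu(x) / s for s > 0,
    the end-to-end function is unchanged, but the per-channel weight ranges
    of the two layers become balanced.
    """
    r1 = np.abs(W1).max(axis=1)   # per-output-channel range of W1
    r2 = np.abs(W2).max(axis=0)   # per-input-channel range of W2
    s = np.sqrt(r1 * r2) / r2     # choose s so that r1/s == r2*s
    return W1 / s[:, None], b1 / s, W2 * s[None, :]

rng = np.random.default_rng(0)
W1 = rng.normal(size=(8, 4)) * rng.uniform(0.01, 10.0, size=(8, 1))  # imbalanced channels
b1 = rng.normal(size=8)
W2 = rng.normal(size=(4, 8))
x = rng.normal(size=(5, 4))

W1e, b1e, W2e = equalize(W1, b1, W2)
y  = relu(x @ W1.T + b1) @ W2.T
ye = relu(x @ W1e.T + b1e) @ W2e.T
print(np.allclose(y, ye))  # True: the function is preserved
```

After equalization, every channel of W1e and W2e spans the same range sqrt(r1 * r2), so a single per-tensor INT8 grid no longer wastes most of its levels on a few outlier channels.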
AIMET features: state-of-the-art network compression tools, state-of-the-art quantization tools, support for both TensorFlow and PyTorch, benchmarks and tests for many models, developed by professional software developers.

Developing AI approaches for energy-efficient video processing
There is an increasing demand to reduce unnecessary computation. Valuable information is being extracted from video streams across diverse devices and use cases:

- Enhanced perception through object detection and semantic segmentation
- Advanced camera features, like video enhancement and super-resolution
- Advanced video understanding, like search and surveillance
- Increased video compression to address the demand for rich media

The key idea is recognizing redundancy between video frames so that the same thing is never computed twice: feature maps over time for ResNet18 remain mostly constant, so there is tremendous redundancy across frames.

Conditional computation using gated networks
Introduce gates with low computation cost to skip unnecessary computation in neural networks.
Problem: a large portion of the neural network is not necessary for the prediction, wasting computation.
Solution: conditional channel-gated networks for task-aware continual learning:

- Dynamically select the filters conditioned on the task and input
- Automatically infer the task from gating patterns

The gating module applies average pooling to the H x W x C input feature map, followed by a two-layer MLP that outputs one gate per channel. [Figure: ImageNet top-1 accuracy vs. MACs, comparing ResNet18/34/50-BAS (ours) against ResNet baselines (18, 34, 50), Convnet-FBS18, Convnet-AIG18/50, and Convnet-SIG34: state-of-the-art accuracy while reducing computation by up to 3x.]

1. Bejnordi, Babak Ehteshami, Tijmen Blankevoort, and Max Welling. "Batch Shaping for Learning Conditional Channel Gated Networks." ICLR 2020.
2. Abati, Davide, Jakub Tomczak, Tijmen Blankevoort, Simone Calderara, Rita Cucchiara, and Babak Ehteshami Bejnordi. "Conditional Channel Gated Networks for Task-Aware Continual Learning." CVPR 2020.

Deep generative model research for unsupervised learning
Generative models: given unlabeled training data, generate new samples from the same distribution.

- Variational autoencoder (VAE)*
- Generative adversarial network (GAN)
- Auto-regressive models
- Invertible models

Powerful capabilities: extract features by learning a low-dimensional feature representation; sample to generate, restore, predict, or compress data.
Broad applications: speech/video compression, text to speech, graphics rendering, computational photography, voice UI.
* VAE first introduced by D. Kingma and M. Welling in 2013.

Example use case: input unlabeled data,
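The channel-gating module from the conditional-computation section above (global average pooling followed by a two-layer MLP that emits per-channel on/off decisions) can be sketched as follows. This is an inference-time illustration only: the batch-shaping loss and differentiable gate training from the cited papers are omitted, and all shapes and weights are random placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_gate(feat, W1, W2, threshold=0.5):
    """Input-dependent channel gating (inference-time sketch).

    feat: (C, H, W) feature map. Global average pooling squeezes it to (C,),
    a two-layer MLP predicts one logit per channel, and channels whose gate
    falls below the threshold are skipped (here: zeroed), so in a real
    runtime their convolutions would never be computed at all.
    """
    pooled = feat.mean(axis=(1, 2))            # (C,)  global average pooling
    hidden = np.maximum(pooled @ W1, 0.0)      # (C',) first MLP layer + ReLU
    gates = sigmoid(hidden @ W2) > threshold   # (C,)  hard on/off decisions
    return feat * gates[:, None, None], gates

C, H, W, C_hidden = 16, 8, 8, 4
feat = rng.normal(size=(C, H, W))
W1 = rng.normal(size=(C, C_hidden))
W2 = rng.normal(size=(C_hidden, C))

gated, gates = channel_gate(feat, W1, W2)
print(f"channels kept: {int(gates.sum())}/{C}")
```

The gate itself costs only C + C' multiply-accumulates per position, which is why skipping even a fraction of the C full convolutions is a net win, and why the papers report up to 3x computation savings at matched accuracy.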
