Jen-Hsun Huang, Co-Founder & CEO, NVIDIA | GTC China 2016
THE DEEP LEARNING AI REVOLUTION
JEN-HSUN HUANG, CO-FOUNDER & CEO, NVIDIA | GTC CHINA 2016

GPU DEEP LEARNING BIG BANG
ImageNet Classification with Deep Convolutional Neural Networks
Alex Krizhevsky, Ilya Sutskever, Geoffrey E. Hinton (University of Toronto), NIPS (2012)
Deep Learning | NVIDIA GPU

GPU DEEP LEARNING ACHIEVES “SUPERHUMAN” RESULTS
[Chart: ImageNet accuracy (%), 2010 to 2015. Labels: 96%; Human; Microsoft, Google: 3.5% error rate; DL; 74%; Hand-coded CV.]
2012: Deep Learning researchers worldwide discover GPUs
2015: DNN achieves superhuman image recognition
2015: Deep Speech 2 achieves superhuman voice recognition

NVIDIA: “THE AI COMPUTING COMPANY”
GPU Computing | Computer Graphics | Artificial Intelligence

ANNOUNCING NEW GRAPHICS SDKS
• Ansel: In-game Photography
• Volumetric: Physical Light Models
• OptiX 4.0: Multi-GPU Ray-Tracing
• Mental Ray: Now GPU-Accelerated!
• MDL 1.0: Physically Based Materials
Funhouse VR 360 Video 1.0 Iray VR Remote Rendering GVDB Open Source
Real-Time Panoramic VR Photorealistic VR Ray Tracing Video Compositing Sparse Volumes for Special Effects

NVIDIA VR FUNHOUSE

NVIDIA SILICON VALLEY HEADQUARTERS

GTC: 25X GROWTH IN GPU DL DEVELOPERS
Attendees: 3,700 (2014) to 16,000 (2016), 4X
GPU Developers: 120,000 (2014) to 400,000 (2016), 3X
Deep Learning Developers: 2,200 (2014) to 55,000 (2016), 25X
• Higher Ed 35%
• Software 19%
• Internet 15%
• Auto 10%
• Government 5%
• Medical 4%
• Finance 4%
• Manufacturing 4%
• Australia
• China
• Europe
• India
• Japan
• Korea
• United States (Silicon Valley, D.C.)

WHY DID AI RESEARCHERS ADOPT GPUs FOR DEEP LEARNING?

BRAIN IS LIKE A GPU??
BRAIN CREATES MENTAL IMAGES WHEN WE THINK

GPU IS LIKE A BRAIN

GPU DEEP LEARNING IS A NEW COMPUTING MODEL
Training | Datacenter | Device
TRAINING: Billions of Trillions of Operations. GPU train larger models, accelerate time to market.
DATACENTER INFERENCING: 10s of billions of image, voice, video queries per day. GPU inference for fast response, maximize datacenter throughput.
DEVICE INFERENCING: Billions of intelligent devices. GPU for real-time accurate response.

AI: THE ULTIMATE COMPUTING CHALLENGE
IMAGE RECOGNITION (16X Model)
• 2012, AlexNet: 8 layers, 1.4 GFLOP/image, ~16% Error
• 2015, ResNet: 152 layers, 22.6 GFLOP/image, ~3.5% error
SPEECH RECOGNITION (10X Training Ops)
• 2014, Deep Speech 1: 2 ExaFLOPS, 25M | 7,000 Hours, ~8% Error
• 2015, Deep Speech 2: 20 ExaFLOPS, 100M | 12,000 Hours, ~5% Error
Important Property of Neural Networks: Results get better with more data + bigger models + more computation. (Better algorithms, new insights and improved techniques always help, too!)

PASCAL “5 MIRACLES” BOOST DEEP LEARNING 65X
[Chart: relative speed-up, 10X to 70X, 2013 to 2016. Series labels: Kepler, Maxwell, Pascal.]
PaddlePaddle (Baidu Deep Learning)
Pascal: 5 Miracles
• Pascal
• 16nm FinFET
• CoWoS HBM2
• NVLink
• cuDNN
NVIDIA DGX-1 Supercomputer | 65X in 4 yrs | Accelerate Every Framework
Chart: Relative speed-up of images/sec vs K40 in 2013. AlexNet training throughput based on 20 iterations. CPU: 1x E5-2680v3 12 Core 2.5GHz, 128GB System Memory, Ubuntu 14.04. M40 datapoint: 8x M40 GPUs in a node. P100: 8x P100, NVLink-enabled.
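
The 16X and 10X multipliers on the computing-challenge slide can be checked directly from the per-model figures it quotes. The following is a minimal sketch; the constant and variable names are illustrative and do not come from the deck, and the numbers are copied from the slide as printed.

```python
# Figures as printed on the slide; names are illustrative only.
ALEXNET_GFLOP_PER_IMAGE = 1.4     # AlexNet, 2012, 8 layers
RESNET_GFLOP_PER_IMAGE = 22.6     # ResNet, 2015, 152 layers
DEEP_SPEECH_1_EXAFLOPS = 2.0      # Deep Speech 1, 2014
DEEP_SPEECH_2_EXAFLOPS = 20.0     # Deep Speech 2, 2015

# Growth in per-image compute for image recognition.
model_growth = RESNET_GFLOP_PER_IMAGE / ALEXNET_GFLOP_PER_IMAGE

# Growth in total training operations for speech recognition.
training_growth = DEEP_SPEECH_2_EXAFLOPS / DEEP_SPEECH_1_EXAFLOPS

print(f"Image model compute growth:  {model_growth:.1f}x")
print(f"Speech training-ops growth: {training_growth:.1f}x")
```

The image-recognition ratio comes out slightly above 16, which the slide rounds to 16X; the speech-recognition ratio is exactly 10.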
ANNOUNCING NEW IBM SERVER
POWER8 + NVIDIA TESLA P100 FOR THE AI ENTERPRISE
“Putting NVIDIA’s technology into the IBM system will speed up performance for such emerging workloads as AI, deep learning and data analytics.” (eWeek)

Andrew Ng, Chief Scientist

Training | Datacenter | Device

ANNOUNCING TESLA P4 & P40 INFERENCING ACCELERATORS
Pascal Architecture | INT8
P40: 250W | 40X Energy Efficient versus CPU
P40: 250W | 40X Performance versus CPU

ANNOUNCING TensorRT
PERFORMANCE OPTIMIZING INFERENCING ENGINE
FP32, FP16, INT8 | Vertical & Horizontal Fusion | Auto-Tuning
VGG, GoogLeNet, ResNet, AlexNet & Custom Layers
Available Today: developer.nvidia.com/tensorrt

NVIDIA GPU DEEP LEARNING EVERYWHERE
Alibaba/Aliyun, Amazon, Baidu, eBay, Facebook, Flickr, Google, iFLYTEK, iQIYI, JD.com, Microsoft, Netflix, Orange, Periscope, Pinterest, Qihoo 360, Shazam, Skype, Sogou, Tencent, Twitter, Yahoo Supermarket, Yandex, Yelp

>1,500 AI STARTUPS AROUND THE WORLD
• Deep Learning for Cybersecurity
• Deep Learning for Genomics
• Deep Learning for Self-Driving Cars
• Deep Learning for Art

AI STARTUPS IN CHINA
• Weather & Environment Forecast
• Eye-tracking for Human-machine Interaction
• Medical Imaging
• Face Recognition
• Product Recognition, Detection, Search
• Personal Concierge App

Training | Datacenter | Device

“BILLIONS OF INTELLIGENT DEVICES”
“Billions of intelligent devices will take advantage of DNNs to provide personalization and localization as GPUs become faster and faster over the next several years.” (Tractica)

AI CITY: 1B CAMERAS BY 2020
• ~1 billion cameras worldwide by 2020
• 30 billion inferences/sec
• Tesla P40: 2,500 inferences/sec @ 720P
• AI City needs ~10M P40 servers
DATA: 1B cameras, IHS “Video Surveillance Intelligence Service, Aug. 2016”

1/20TH THE SPACE, 1/10TH THE POWER
Traditional Server: ~21 1U Servers, 42 CPUs, ~4,000 W
Hikvision Blade: 1 Hikvision Blade, 16 TX1 + 1 CPU, >8 1080 streams, ~300 W

NVIDIA DGX-1 | Hikvision Blade, 16 Jetson TX1s

ANNOUNCING NVIDIA AI CITY PARTNERS

AI TRANSPORTATION: $10T INDUSTRY

PERCEPTION AI

PERCEPTION AI | LOCALIZATION | DRIVING AI

DEEP LEARNING FREE SPACE DETECTION

CAR 3D DETECTION

NVIDIA BB8 AI CAR

NVIDIA DRIVE PX 2: AUTOCRUISE TO FULL AUTONOMY, ONE ARCHITECTURE
Full Autonomy | AutoChauffeur | AutoCruise
AUTONOMOUS DRIVING: Perception, Reasoning, Driving
AI Supercomputing, AI Algorithms, Software
Scalable Architecture

NVIDIA DRIVE PX 2 AUTOCRUISE
10W AI Car Computer | Passive Cooling | Automotive IO
AI Highway Driving | Localization & Mapping

NVIDIA & BAIDU PARTNER ON AI SELF-DRIVING CARS

NVIDIA AI SELF-DRIVING CARS IN DEVELOPMENT
Baidu, nuTonomy, Volvo, WEpods, NVIDIA

NVIDIA END-TO-END DEEP LEARNING PLATFORM
CUDA
TRAINING: PaddlePaddle (Baidu Deep Learning) | TESLA P100, DGX-1
DATACENTER INFERENCING: ANNOUNCING TensorRT | ANNOUNCING TESLA P4 & P40
INTELLIGENT DEVICES: JETPACK, DRIVEWORKS | JETSON TX1, DRIVE PX 2 AUTOCRUISE

NVIDIA DEEP LEARNING PLATFORM PARTNERS
AI ENTERPRISE | AI CITY | AI CAR

AI FOR EVERYONE
AI will Revolutionize Transportation
AI will Revolutionize Healthcare
AI will Revolutionize Society
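
The AI City sizing and the Hikvision power comparison quoted earlier in the deck are back-of-envelope arithmetic and can be reproduced in a few lines. This is a rough sketch under stated assumptions: the 30 frames per second per camera is an assumption (the deck does not state it, but it makes 1 billion cameras match the quoted 30 billion inferences/sec), one inference per frame is assumed, and all names are illustrative.

```python
# Figures from the AI City slide; FPS_PER_CAMERA is an assumption, not from the deck.
CAMERAS = 1_000_000_000            # ~1 billion cameras by 2020
FPS_PER_CAMERA = 30                # assumed: one inference per frame at 30 fps
P40_INFERENCES_PER_SEC = 2_500     # Tesla P40 at 720P, per the slide

total_inferences_per_sec = CAMERAS * FPS_PER_CAMERA
p40s_needed = total_inferences_per_sec / P40_INFERENCES_PER_SEC

# Figures from the Hikvision comparison slide.
TRADITIONAL_SERVERS = 21           # ~21 1U servers
TRADITIONAL_WATTS = 4_000          # ~4,000 W
BLADE_WATTS = 300                  # ~300 W for one Hikvision blade

power_ratio = TRADITIONAL_WATTS / BLADE_WATTS

print(f"Total inferences/sec: {total_inferences_per_sec:,}")
print(f"P40s needed:          {p40s_needed:,.0f}")
print(f"Power ratio:          {power_ratio:.1f}x")
print(f"Server count ratio:   {TRADITIONAL_SERVERS}:1")
```

Under these assumptions the division gives about 12 million P40s, the same order of magnitude as the deck's "~10M", and a power ratio of roughly 13 to 1, consistent with the "1/10TH THE POWER" headline. The "1/20TH THE SPACE" figure matches the 21-to-1 server count only if the blade occupies about one rack unit, which the deck does not state.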