NVIDIA TESLA V100 GPU ACCELERATOR
The Most Advanced Data Center GPU Ever Built.

NVIDIA® Tesla® V100 is the world's most advanced data center GPU, built to accelerate AI, HPC, and graphics. Powered by NVIDIA Volta, the latest GPU architecture, Tesla V100 offers the performance of up to 100 CPUs in a single GPU, enabling data scientists, researchers, and engineers to tackle challenges that were once thought impossible.

SPECIFICATIONS

                                 Tesla V100 PCIe           Tesla V100 SXM2
GPU Architecture                 NVIDIA Volta              NVIDIA Volta
NVIDIA Tensor Cores              640                       640
NVIDIA CUDA® Cores               5,120                     5,120
Double-Precision Performance     7 TFLOPS                  7.8 TFLOPS
Single-Precision Performance     14 TFLOPS                 15.7 TFLOPS
Tensor Performance               112 TFLOPS                125 TFLOPS
GPU Memory                       32GB / 16GB HBM2          32GB / 16GB HBM2
Memory Bandwidth                 900GB/sec                 900GB/sec
ECC                              Yes                       Yes
Interconnect Bandwidth           32GB/sec                  300GB/sec
System Interface                 PCIe Gen3                 NVIDIA NVLink
Form Factor                      PCIe Full Height/Length   SXM2
Max Power Consumption            250 W                     300 W
Thermal Solution                 Passive                   Passive
Compute APIs                     CUDA, DirectCompute, OpenCL™, OpenACC

[Figure: 47X Higher Throughput than CPU Server on Deep Learning Inference — Tesla V100: 47X, Tesla P100: 15X, CPU: 1X (performance normalized to CPU). Workload: ResNet-50 | CPU: 1X Xeon E5-2690 v4 @ 2.6GHz | GPU: add 1X NVIDIA Tesla P100 or V100.]

[Figure: Deep Learning Training in Less Than a Workday — time to solution in hours (lower is better): 8X V100: 51 hours; 8X P100: 155 hours. Server: dual Xeon E5-2699 v4 @ 2.6GHz with 8X Tesla P100 or V100 | ResNet-50 training on MXNet for 90 epochs with the 1.28M-image ImageNet dataset.]

[Figure: 1 GPU Node Replaces Up To 54 CPU Nodes — node replacement, HPC mixed workload, in CPU-only nodes: Life Science (NAMD): 14; Physics (GTC): 17; Physics (MILC): 32; Geo Science (SPECFEM3D): 54. CPU server: dual Xeon Gold 6140 @ 2.30GHz; GPU server: same CPU server with 4X V100 PCIe | CUDA version: CUDA 9.x | Datasets: NAMD (STMV), GTC (mpi#proc.in), MILC (APEX Medium), SPECFEM3D (four_material_simple_model). To arrive at CPU node equivalence, measured benchmarks with up to 8 CPU nodes were used, then linear scaling to extrapolate beyond 8 nodes.]

TESLA V100 | Data Sheet | MAR18
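The peak-throughput rows in the specifications table follow from core counts and clock speed. As a rough sanity check (assuming an SXM2 boost clock of about 1530 MHz, which this datasheet does not itself list), each Tensor Core performs a 4x4x4 matrix multiply-accumulate per cycle:

```python
# Sanity-check the peak-TFLOPS figures from the spec table.
# Assumption (not stated in the datasheet): SXM2 boost clock ~1530 MHz.
boost_clock_hz = 1530e6

# Each of the 640 Tensor Cores does a 4x4x4 matrix FMA per cycle:
# 4*4*4 = 64 multiply-adds = 128 floating-point operations.
tensor_tflops = 640 * 128 * boost_clock_hz / 1e12
print(f"Tensor: {tensor_tflops:.1f} TFLOPS")  # ~125.3, matching the 125 TFLOPS SXM2 figure

# FP32: 5,120 CUDA cores, one FMA (2 FLOPs) per cycle each.
fp32_tflops = 5120 * 2 * boost_clock_hz / 1e12
print(f"FP32:   {fp32_tflops:.1f} TFLOPS")    # ~15.7 TFLOPS

# FP64 throughput on Volta is half the FP32 rate.
fp64_tflops = fp32_tflops / 2
print(f"FP64:   {fp64_tflops:.1f} TFLOPS")    # ~7.8 TFLOPS
```

The lower PCIe figures (7 / 14 / 112 TFLOPS) follow the same arithmetic at the PCIe card's lower boost clock.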
GROUNDBREAKING INNOVATIONS

VOLTA ARCHITECTURE
By pairing CUDA Cores and Tensor Cores within a unified architecture, a single server with Tesla V100 GPUs can replace hundreds of commodity CPU servers for traditional HPC and deep learning.

TENSOR CORE
Equipped with 640 Tensor Cores, Tesla V100 delivers 125 teraFLOPS of deep learning performance. That's 12X the Tensor FLOPS for DL training, and 6X the Tensor FLOPS for DL inference, compared to NVIDIA Pascal™ GPUs.

NEXT GENERATION NVLINK
NVIDIA NVLink in Tesla V100 delivers 2X higher throughput compared to the previous generation. Up to eight Tesla V100 accelerators can be interconnected at up to 300GB/s to unleash the highest application performance possible on a single server.

MAXIMUM EFFICIENCY MODE
The new maximum efficiency mode allows data centers to achieve up to 40% higher compute capacity per rack within the existing power budget. In this mode, Tesla V100 runs at peak processing efficiency, providing up to 80% of the performance at half the power consumption.

HBM2
With a combination of improved raw bandwidth of 900GB/s and higher DRAM utilization efficiency at 95%, Tesla V100 delivers 1.5X higher memory bandwidth over Pascal GPUs as measured on STREAM. Tesla V100 is now available in a 32GB configuration that doubles the memory of the standard 16GB offering.

PROGRAMMABILITY
Tesla V100 is architected from the ground up to simplify programmability. Its new independent thread scheduling enables finer-grain synchronization and improves GPU utilization by sharing resources among small jobs.

Tesla V100 is the flagship product of the Tesla data center computing platform for deep learning, HPC, and graphics. The Tesla platform accelerates over 550 HPC applications and every major deep learning framework. It is available everywhere from desktops to servers to cloud services, delivering both dramatic performance gains and cost savings opportunities.
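The 900GB/s and 300GB/s bandwidth figures above can be reproduced from the interface widths. A minimal sketch, noting that the HBM2 bus width, per-pin data rate, and NVLink link count below come from public Volta material rather than this datasheet:

```python
# Rough arithmetic behind the quoted bandwidth figures.
# Assumptions (public Volta material, not stated in this datasheet):
# - HBM2: 4096-bit bus (4 stacks x 1024 bits) at ~1.76 Gb/s per pin
# - NVLink: 6 links per GPU, 25 GB/s per link per direction

hbm2_gb_s = 4096 / 8 * 1.76   # bytes per transfer * transfers per second
print(f"HBM2:   ~{hbm2_gb_s:.0f} GB/s")  # ~901, i.e. the quoted 900GB/sec

nvlink_gb_s = 6 * 25 * 2      # links * per-direction rate * both directions
print(f"NVLink: {nvlink_gb_s} GB/s")     # 300, the SXM2 interconnect figure
```

The 32GB/sec PCIe figure is likewise the Gen3 x16 bidirectional aggregate (~16 GB/s per direction).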
EVERY DEEP LEARNING FRAMEWORK

550+ GPU-ACCELERATED APPLICATIONS

HPC: AMBER, ANSYS Fluent, GAUSSIAN, GROMACS, LS-DYNA, NAMD, OpenFOAM, Simulia Abaqus, VASP, WRF

To learn more about the Tesla V100 visit www.nvidia.com/v100

© 2018 NVIDIA Corporation. All rights reserved. NVIDIA, the NVIDIA logo, Tesla, NVIDIA GPU Boost, CUDA, and NVIDIA Volta are trademarks and/or registered trademarks of NVIDIA Corporation in the U.S. and other countries. OpenCL is a trademark of Apple Inc. used under license to the Khronos Group Inc. All other trademarks and copyrights are the property of their respective owners. MAR18