TESLA V100 GPU ACCELERATOR

The Most Advanced Data Center GPU Ever Built

NVIDIA® Tesla® V100 is the world's most advanced data center GPU ever built to accelerate AI, HPC, and graphics. Powered by NVIDIA Volta™, the latest GPU architecture, Tesla V100 offers the performance of up to 100 CPUs in a single GPU, enabling data scientists, researchers, and engineers to tackle challenges that were once thought impossible.

SPECIFICATIONS

                                   Tesla V100 PCIe            Tesla V100 SXM2
GPU Architecture                   NVIDIA Volta
NVIDIA Tensor Cores                640
NVIDIA CUDA® Cores                 5,120
Double-Precision Performance       7 TFLOPS                   7.5 TFLOPS
Single-Precision Performance       14 TFLOPS                  15 TFLOPS
Tensor Performance                 112 TFLOPS                 120 TFLOPS
GPU Memory                         16 GB HBM2
Memory Bandwidth                   900 GB/sec
ECC                                Yes
Interconnect Bandwidth*            32 GB/sec                  300 GB/sec
System Interface                   PCIe Gen3                  NVIDIA NVLink
Form Factor                        PCIe Full Height/Length    SXM2
Max Power Consumption              250 W                      300 W
Thermal Solution                   Passive
Compute APIs                       CUDA, DirectCompute, OpenCL™, OpenACC

[Chart: Deep Learning Training in One Workday. Time to solution in hours (lower is better): 8X V100: 7.4 hours; 8X P100: 18 hours; 8X K80: 44 hours. Server config: Dual E5-2699 v4, 2.6GHz | 8X Tesla K80, Tesla P100 or Tesla V100 | V100 performance measured on pre-production hardware | ResNet-50 training on Microsoft Cognitive Toolkit for 90 epochs with the 1.28M-image ImageNet dataset.]

[Chart: 30X Higher Throughput than CPU Server on Deep Learning Inference. Performance normalized to CPU (Tesla V100, Tesla P100, 2X CPU). Workload: ResNet-50 | CPU: 2X Xeon E5-2690v4 @ 2.6GHz | GPU: add 1X NVIDIA® Tesla® P100 or V100 at 150W | V100 measured on pre-production hardware.]

[Chart: 1.5X HPC Performance in One Year. Performance normalized to P100 for STREAM, Physics (QUDA), Seismic (RTM), and cuFFT. CPU System: 2X Xeon E5-2690v4 @ 2.6GHz | GPU System: NVIDIA® Tesla® P100 or V100 | V100 measured on pre-production hardware.]

GROUNDBREAKING INNOVATIONS

VOLTA ARCHITECTURE
By pairing CUDA Cores and Tensor Cores within a unified architecture, a single server with Tesla V100 GPUs can replace hundreds of commodity CPU servers for traditional HPC and Deep Learning.

TENSOR CORE
Equipped with 640 Tensor Cores, Tesla V100 delivers 120 TeraFLOPS of deep learning performance. That's 12X Tensor FLOPS for DL Training, and 6X Tensor FLOPS for DL Inference when compared to NVIDIA Pascal™ GPUs.

NEXT GENERATION NVLINK
NVIDIA NVLink in Tesla V100 delivers 2X higher throughput compared to the previous generation. Up to eight Tesla V100 accelerators can be interconnected at up to 300 GB/s to unleash the highest application performance possible on a single server.
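Volta's Tensor Cores are exposed to CUDA C++ through the warp matrix (WMMA) API introduced in CUDA 9 for compute capability 7.0. The sketch below is not part of the datasheet: one warp computes a single 16x16x16 FP16 multiply with FP32 accumulation on the Tensor Cores. The matrix sizes, test values, and the `tensor_core_tile` kernel name are illustrative only; compile with `nvcc -arch=sm_70`.

```cuda
#include <cuda_fp16.h>
#include <mma.h>
#include <cstdio>
using namespace nvcuda;

// One warp computes a single 16x16x16 tile: D = A*B (FP16 inputs, FP32 accumulate).
__global__ void tensor_core_tile(const half *a, const half *b, float *d) {
    wmma::fragment<wmma::matrix_a, 16, 16, 16, half, wmma::row_major> a_frag;
    wmma::fragment<wmma::matrix_b, 16, 16, 16, half, wmma::col_major> b_frag;
    wmma::fragment<wmma::accumulator, 16, 16, 16, float> acc;

    wmma::fill_fragment(acc, 0.0f);                 // start from C = 0
    wmma::load_matrix_sync(a_frag, a, 16);          // leading dimension 16
    wmma::load_matrix_sync(b_frag, b, 16);
    wmma::mma_sync(acc, a_frag, b_frag, acc);       // executes on Tensor Cores
    wmma::store_matrix_sync(d, acc, 16, wmma::mem_row_major);
}

int main() {
    half *a, *b; float *d;
    cudaMallocManaged(&a, 256 * sizeof(half));
    cudaMallocManaged(&b, 256 * sizeof(half));
    cudaMallocManaged(&d, 256 * sizeof(float));
    for (int i = 0; i < 256; ++i) { a[i] = __float2half(1.0f); b[i] = __float2half(1.0f); }

    tensor_core_tile<<<1, 32>>>(a, b, d);           // one warp drives the tile
    cudaDeviceSynchronize();
    printf("d[0] = %f (expected 16.0)\n", d[0]);    // each output is a dot product of 16 ones

    cudaFree(a); cudaFree(b); cudaFree(d);
    return 0;
}
```

In practice the same Tensor Core path is reached through libraries such as cuBLAS and cuDNN rather than hand-written WMMA kernels.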
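The 300 GB/s NVLink figure describes GPU-to-GPU interconnect bandwidth. In CUDA, traffic of this kind is commonly driven through peer-to-peer access; the following sketch is illustrative rather than taken from the datasheet. It enables peer access between two devices and issues a direct device-to-device copy; whether that copy travels over NVLink or PCIe depends on the system topology, and the 256 MB transfer size is arbitrary.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int devices = 0;
    cudaGetDeviceCount(&devices);
    if (devices < 2) { printf("Need at least two GPUs\n"); return 0; }

    // Check whether GPU 0 and GPU 1 can address each other's memory.
    int access01 = 0, access10 = 0;
    cudaDeviceCanAccessPeer(&access01, 0, 1);
    cudaDeviceCanAccessPeer(&access10, 1, 0);
    if (!access01 || !access10) { printf("Peer access not supported\n"); return 0; }

    cudaSetDevice(0);
    cudaDeviceEnablePeerAccess(1, 0);               // second argument (flags) must be 0
    cudaSetDevice(1);
    cudaDeviceEnablePeerAccess(0, 0);

    const size_t bytes = 256u << 20;                // 256 MB transfer
    void *buf0, *buf1;
    cudaSetDevice(0); cudaMalloc(&buf0, bytes);
    cudaSetDevice(1); cudaMalloc(&buf1, bytes);

    // Direct GPU-to-GPU copy; with peer access enabled this avoids staging through host memory.
    cudaMemcpyPeer(buf1, 1, buf0, 0, bytes);
    cudaDeviceSynchronize();
    printf("Peer copy complete\n");

    cudaFree(buf1);
    cudaSetDevice(0); cudaFree(buf0);
    return 0;
}
```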

MAXIMUM EFFICIENCY MODE
The new maximum efficiency mode allows data centers to achieve up to 40% higher compute capacity per rack within the existing power budget. In this mode, Tesla V100 runs at peak processing efficiency, providing up to 80% of the performance at half the power consumption.

HBM2
With a combination of improved raw bandwidth of 900 GB/s and higher DRAM utilization efficiency at 95%, Tesla V100 delivers 1.5X higher memory bandwidth over Pascal GPUs as measured on STREAM.

PROGRAMMABILITY
Tesla V100 is architected from the ground up to simplify programmability. Its new independent thread scheduling enables finer-grain synchronization and improves GPU utilization by sharing resources among small jobs.
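The HBM2 note above cites bandwidth "as measured on STREAM". As an unofficial illustration (not the STREAM benchmark itself), the sketch below times a STREAM-style triad kernel with CUDA events; the array size, block size, and the `triad` kernel name are arbitrary choices made for this example.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// STREAM-style triad: c[i] = a[i] + scalar * b[i]; counts three memory accesses per element.
__global__ void triad(const double *a, const double *b, double *c, double scalar, size_t n) {
    size_t i = blockIdx.x * (size_t)blockDim.x + threadIdx.x;
    if (i < n) c[i] = a[i] + scalar * b[i];
}

int main() {
    const size_t n = 1u << 27;                      // 128M doubles per array (~1 GB each)
    double *a, *b, *c;
    cudaMalloc(&a, n * sizeof(double));
    cudaMalloc(&b, n * sizeof(double));
    cudaMalloc(&c, n * sizeof(double));
    cudaMemset(a, 0, n * sizeof(double));
    cudaMemset(b, 0, n * sizeof(double));

    cudaEvent_t start, stop;
    cudaEventCreate(&start); cudaEventCreate(&stop);

    triad<<<(n + 255) / 256, 256>>>(a, b, c, 3.0, n);   // warm-up launch
    cudaEventRecord(start);
    triad<<<(n + 255) / 256, 256>>>(a, b, c, 3.0, n);   // timed launch
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);
    double gb = 3.0 * n * sizeof(double) / 1e9;     // read a, read b, write c
    printf("Triad bandwidth: %.1f GB/s\n", gb / (ms / 1e3));

    cudaEventDestroy(start); cudaEventDestroy(stop);
    cudaFree(a); cudaFree(b); cudaFree(c);
    return 0;
}
```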
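The datasheet does not say how maximum efficiency mode is engaged. On NVIDIA data center GPUs, board power caps are commonly managed through NVML (the library behind nvidia-smi); the sketch below is a hypothetical illustration of querying and setting a power limit, not a documented procedure for this mode. The 150 W value is an arbitrary example, changing the limit normally requires administrator privileges, and the program links with -lnvidia-ml.

```cuda
#include <cstdio>
#include <nvml.h>

int main() {
    if (nvmlInit() != NVML_SUCCESS) { printf("NVML init failed\n"); return 1; }

    nvmlDevice_t dev;
    nvmlDeviceGetHandleByIndex(0, &dev);            // first GPU in the system

    // Query the allowed power-limit range before changing anything (values in milliwatts).
    unsigned int minLimit = 0, maxLimit = 0;
    nvmlDeviceGetPowerManagementLimitConstraints(dev, &minLimit, &maxLimit);
    printf("Allowed power limit: %u..%u mW\n", minLimit, maxLimit);

    // Example only: cap the board at 150 W (150000 mW); requires elevated privileges.
    nvmlReturn_t rc = nvmlDeviceSetPowerManagementLimit(dev, 150000);
    printf("Set 150 W limit: %s\n", nvmlErrorString(rc));

    nvmlShutdown();
    return 0;
}
```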

Tesla V100 is the flagship product of the Tesla data center computing platform for deep learning, HPC, and graphics. The Tesla platform accelerates over 450 HPC applications and every major deep learning framework. It is available everywhere, from desktops to servers to cloud services, delivering both dramatic performance gains and cost-saving opportunities.

EVERY DEEP LEARNING FRAMEWORK

450+ GPU-ACCELERATED APPLICATIONS

HPC: AMBER, ANSYS Fluent, GAUSSIAN, GROMACS, LS-DYNA, NAMD, OpenFOAM, Simulia Abaqus, VASP, WRF

To learn more about the Tesla V100, visit www.nvidia.com/v100

© 2017 NVIDIA Corporation. All rights reserved. NVIDIA, the NVIDIA logo, Tesla, NVIDIA GPU Boost, CUDA, and NVIDIA Volta are trademarks and/or registered trademarks of NVIDIA Corporation in the U.S. and other countries. OpenCL is a trademark of Apple Inc. used under license to the Khronos Group Inc. All other trademarks and copyrights are the property of their respective owners. JUL17