
The Ultimate CUDA Development GPU

Introducing GeForce GTX TITAN

• 2688 CUDA Cores
• 4.5 Teraflops Single Precision
• 1.27 Teraflops Double Precision
• 288 GB/s Memory Bandwidth

GTX TITAN Personal on Your Desktop

1 Teraflop < $1000

Develop Anywhere

Deploy on Cluster

Ease of Programming with

Develop on GTX Titan

New Kepler Architecture

The Best of Kepler in a PC

[Chart: Peak Double Precision (GFLOPS), axis 0 to 1500, for Core i7-3970X, GTX 680, and GTX TITAN]

• Boosts PC with 8x More Performance
• Dynamic Parallelism: More Science, Less Coding

Dynamic Parallelism Makes Parallel Programming Easier

Quicksort
• No complex CPU & GPU interaction

• Easier code in half the lines
• Easier porting for existing codes

[Figure: Quicksort code, Before Kepler vs. With Kepler; label: 2x]

More Applications, More Customers

[Figure: workloads plotted by work regularity (Structured Work to Irregular Work) against parallelism type (Data Parallel to Task Parallel), with regions labeled Fermi and Kepler]

Workloads shown:
• Adaptive Mesh (CFD)
• N-Body Tree Codes
• Sparse Matrix
• Video Transcoding
• Algebraic Multigrid
• Divide and Conquer (Big Data)
• Direct N-body
• Burrows-Wheeler Aligner (Bioinformatics)
• Monte Carlo
• Reverse Time Migration

Comparing GTX TITAN and Tesla K20X

| Features | GeForce GTX TITAN | Tesla K20X |
|---|---|---|
| Core/Mem clock | 837 MHz / 3 GHz (clocks may vary when double precision is on) | 732 MHz / 2.6 GHz |
| Peak Single Precision | ~4.5 TFLOPS | 3.95 TFLOPS |
| Peak Double Precision | ~1.27 TFLOPS (estimate) | 1.31 TFLOPS |
| Memory size | 6 GB | 6 GB |
| Memory BW (ECC off) | 288 GB/s | 250 GB/s |
| PCIe | Gen 3 only on Ivy Bridge; Gen 2 on Sandy Bridge | Gen 2 |
| CUDA Features | Dynamic Parallelism, Hyper-Q for CUDA Streams, GPUDirect Peer to Peer | Dynamic Parallelism, Hyper-Q Proxy for MPI and CUDA Streams, GPUDirect Peer to Peer, and RDMA |
| GPU monitoring | None | NVML/NVSMI, OOB, InfoROM, NVHealthmon, TCC |
| Cluster monitoring | None | Bright Computing, Ganglia |
| ECC Features | No ECC | DRAM, Internal Caches & Reg Files |
| Total Board Power | 250 W | 235 W |

Tesla Advantage: Built for Deployment

Reliability, Performance, Built for HPC

• Integrated solutions & support from OEMs and channel
• ECC protection
• Tools for GPU Management (Nvhealthmon, nvsmi/nvml)
• Tested to run 24/7 with real-world workloads
• Fastest DP of 1.31 TFLOPS on Tesla K20X
• Tools for Cluster Management
• 3 year warranty and support for bugs/feature requests
• Optimized for InfiniBand with GPUDirect™
• Enterprise OS support
• ISVs certify only on Tesla
• Solution expertise provided by CUDA engineers and technical staff
• Hyper-Q for accelerating MPI-based workloads
• NVIDIA technical support
• Tuning and optimization support from NVIDIA experts
• Longer life cycle for continuity and cluster expansion
• Designed, tested and optimized for cluster deployment

Develop with GeForce, Deploy with Tesla

Call to Action

• Promote GTX Titan with OpenACC to new developers
    • Increases the number of GPU-enabled workloads

• Continue recommending Tesla for workloads in production environments
    • Tesla Advantage: Performance + Reliability + Built for HPC

• Refer to GTX Titan and Tesla FAQ for differentiation
    • GPUDirect, Hyper-Q, GPU Boost, GPU monitoring
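
When discussing differentiation, it can help to show where the headline numbers in the GTX TITAN and Tesla K20X comparison come from. The sketch below is a rough cross-check, not an official derivation. It assumes 2 floating-point operations per CUDA core per clock (one fused multiply-add), a 384-bit memory bus, a memory clock that transfers data twice per cycle, and 2688 CUDA cores on the Tesla K20X as well; none of these are stated in the deck. The function names are hypothetical.

```python
# Rough cross-check of the peak single-precision and memory-bandwidth
# figures quoted for GTX TITAN and Tesla K20X.
# ASSUMPTIONS (not from the deck): 2 FLOPs per core per clock (one FMA),
# a 384-bit memory bus, double-data-rate memory clock, and 2688 CUDA
# cores on the K20X.

FLOPS_PER_CORE_PER_CLOCK = 2   # assumed: one fused multiply-add per clock
BUS_WIDTH_BITS = 384           # assumed memory interface width
TRANSFERS_PER_CLOCK = 2        # assumed: two transfers per listed memory clock


def peak_sp_tflops(cuda_cores, core_clock_mhz):
    """Peak single-precision throughput in TFLOPS."""
    flops = cuda_cores * FLOPS_PER_CORE_PER_CLOCK * core_clock_mhz * 1e6
    return flops / 1e12


def memory_bandwidth_gbs(mem_clock_ghz):
    """Peak memory bandwidth in GB/s."""
    bytes_per_transfer = BUS_WIDTH_BITS / 8
    return mem_clock_ghz * TRANSFERS_PER_CLOCK * bytes_per_transfer


# Clocks are the ones listed in the deck; the K20X core count is assumed.
BOARDS = {
    "GeForce GTX TITAN": {"cores": 2688, "core_mhz": 837, "mem_ghz": 3.0},
    "Tesla K20X": {"cores": 2688, "core_mhz": 732, "mem_ghz": 2.6},
}

if __name__ == "__main__":
    for name, spec in BOARDS.items():
        tflops = peak_sp_tflops(spec["cores"], spec["core_mhz"])
        bandwidth = memory_bandwidth_gbs(spec["mem_ghz"])
        print(f"{name}: {tflops:.2f} TFLOPS SP, {bandwidth:.1f} GB/s")
```

Under these assumptions the GTX TITAN figures land on the quoted ~4.5 TFLOPS and 288 GB/s, and the Tesla K20X figures come out within rounding distance of the quoted 3.95 TFLOPS and 250 GB/s.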

Thank You