Introducing GeForce GTX TITAN: The Ultimate CUDA Development GPU

  • 2688 CUDA Cores
  • 4.5 Teraflops Single Precision
  • 1.27 Teraflops Double Precision
  • 288 GB/s Memory Bandwidth

GTX TITAN: Personal Supercomputer on Your Desktop

  • 1 Teraflop < $1000
  • Develop Anywhere, Deploy on Cluster
  • Ease of Programming with …
  • Develop on GTX Titan
  • New Kepler Architecture

The Best of Kepler in a PC

  • Dynamic Parallelism
  • Boosts PC with 8x More Performance

[Chart: Peak Double Precision in GFLOPS (axis 0 to 1500) for Core i7-3970X, GTX 680, and GTX TITAN]

More Science, Less Coding

Dynamic Parallelism Makes Parallel Programming Easier

  • No complex CPU & GPU interaction
  • Easier code in half the lines
  • Easier porting for existing codes

[Figure: Quicksort code before Kepler and with Kepler, labeled "2x"]

More Applications, More Customers

[Diagram: workloads placed on two axes, Irregular Work to Structured Work and Data Parallel to Task Parallel, with regions labeled Kepler and Fermi]

  • Adaptive Mesh (CFD)
  • N-Body Tree Codes (Astrophysics)
  • Sparse Matrix
  • Video Transcoding
  • Algebraic Multigrid
  • Divide and Conquer (Big Data)
  • Direct N-body
  • Burrows-Wheeler Aligner (Bioinformatics)
  • Monte Carlo
  • Reverse Time Migration

Comparing GTX TITAN and Tesla K20X

Features              GeForce GTX TITAN                     Tesla K20X
Core/Mem clock        837MHz/3GHz (clocks may vary          732MHz/2.6GHz
                      when double precision is on)
Peak Single Precision ~4.5 TFlops                           3.95 TFlops
Peak Double Precision ~1.27 TFlops (estimate)               1.32 TFlops
Memory size           6 GB                                  6 GB
Memory BW (ECC off)   288 GB/s                              250 GB/s
PCIe                  Gen 3 only on Ivy Bridge,             Gen 2
                      Gen 2 on Sandy Bridge
CUDA Features         Dynamic Parallelism, Hyper-Q for      Dynamic Parallelism, Hyper-Q Proxy for
                      CUDA Streams, GPUDirect Peer to Peer  MPI and CUDA Streams, GPUDirect Peer
                                                            to Peer, and RDMA
GPU monitoring        None                                  NVML/NVSMI, OOB, InfoROM,
                                                            NVHealthmon, TCC
Cluster monitoring    None                                  Bright Computing, Ganglia
ECC Features          No ECC                                DRAM, Internal Caches & Reg Files
Total Board Power     250W                                  235W

Tesla Advantage: Built for Deployment

Reliability

  • ECC protection
  • Tested to run 24/7 with real-world workloads
  • 3 year warranty and support for bugs/feature requests
  • ISVs certify only on Tesla
  • NVIDIA technical support
  • Longer life cycle for continuity and cluster expansion

Performance

  • Fastest DP of 1.31 TFLOPS on Tesla K20X
  • Optimized for Infiniband with NVIDIA GPUDirect™
  • Hyper-Q for accelerating MPI based workloads
  • Tuning and optimization support from NVIDIA experts

Built for HPC

  • Integrated solutions & support from OEMs and channel
  • Tools for GPU Management (Nvhealthmon, nvsmi/nvml)
  • Tools for Cluster Management
  • Enterprise OS support
  • Solution expertise provided by CUDA engineers and technical staff
  • Designed, tested and optimized for cluster deployment

Develop with GeForce, Deploy with Tesla

Call to Action

  • Promote GTX Titan with OpenACC to new developers
    • Increases # of GPU enabled workloads
  • Continue recommending Tesla for workloads in production environment
    • Tesla Advantage: Performance + Reliability + Built for HPC
  • Refer to GTX Titan and Tesla FAQ for differentiation
    • GPUDirect, Hyper-Q, GPUBoost, GPUMonitoring

Thank You
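
The peak figures quoted for GTX TITAN and Tesla K20X can be roughly cross-checked from the core count and clocks given in the slides. The sketch below is a back-of-the-envelope check, not NVIDIA's own method. It relies on three assumptions that the slides do not state: each CUDA core retires one fused multiply-add (2 FLOPs) per cycle, the memory bus is 384 bits wide, and the quoted memory clock carries two transfers per cycle. The function names are hypothetical.

```python
# Back-of-the-envelope check of the peak figures in the slides.
# ASSUMPTIONS (not stated in the source):
#   - 2 FLOPs per CUDA core per cycle (one fused multiply-add)
#   - 384-bit memory bus on both boards
#   - two data transfers per quoted memory-clock cycle


def peak_sp_tflops(cuda_cores, core_clock_mhz, flops_per_cycle=2):
    """Peak single-precision throughput in TFLOPS."""
    return cuda_cores * flops_per_cycle * core_clock_mhz * 1e6 / 1e12


def memory_bandwidth_gbs(mem_clock_ghz, bus_width_bits=384, transfers_per_clock=2):
    """Peak memory bandwidth in GB/s."""
    return mem_clock_ghz * transfers_per_clock * bus_width_bits / 8


# Values taken from the slides: 2688 cores, 837 MHz core, 3 GHz memory (TITAN),
# and 2.6 GHz memory (Tesla K20X).
titan_sp = peak_sp_tflops(2688, 837)
titan_bw = memory_bandwidth_gbs(3.0)
k20x_bw = memory_bandwidth_gbs(2.6)

print(f"GTX TITAN peak single precision: {titan_sp:.2f} TFLOPS")
print(f"GTX TITAN memory bandwidth:      {titan_bw:.1f} GB/s")
print(f"Tesla K20X memory bandwidth:     {k20x_bw:.1f} GB/s")
```

Under these assumptions the results land close to the quoted ~4.5 TFLOPS, 288 GB/s, and 250 GB/s. The double-precision figures are not reproduced here because the slides note that clocks may vary when double precision is enabled.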
