CUDA Samples

CUDA Samples Reference Manual
TRM-06704-001_v11.4 | September 2021

Table of Contents

Chapter 1. Release Notes
  1.1. CUDA 11.4
  1.2. CUDA 11.3
  1.3. CUDA 11.2
  1.4. CUDA 11.1
  1.5. CUDA 11.0
  1.6. CUDA 10.2
  1.7. CUDA 10.1 Update 2
  1.8. CUDA 10.1 Update 1
  1.9. CUDA 10.1
  1.10. CUDA 10.0
  1.11. CUDA 9.2
  1.12. CUDA 9.0
  1.13. CUDA 8.0
  1.14. CUDA 7.5
  1.15. CUDA 7.0
  1.16. CUDA 6.5
  1.17. CUDA 6.0
  1.18. CUDA 5.5
  1.19. CUDA 5.0
  1.20. CUDA 4.2
  1.21. CUDA 4.1

Chapter 2. Getting Started
  2.1. Getting CUDA Samples
    Windows
    Linux
  2.2. Building Samples
    Windows
    Linux
  2.3. CUDA Cross-Platform Samples
    TARGET_ARCH
    TARGET_OS
    TARGET_FS
    Cross Compiling to Embedded ARM architectures
    Copying Libraries
  2.4. Using CUDA Samples to Create Your Own CUDA Projects
    2.4.1. Creating CUDA Projects for Windows
    2.4.2. Creating CUDA Projects for Linux

Chapter 3. Samples Reference
  3.1. Simple Reference
    asyncAPI
    bf16TensorCoreGemm - bfloat16 Tensor Core GEMM
    binaryPartitionCG - Binary Partition Cooperative Groups
    cdpSimplePrint - Simple Print (CUDA Dynamic Parallelism)
    cdpSimpleQuicksort - Simple Quicksort (CUDA Dynamic Parallelism)
    clock - Clock
    clock_nvrtc - Clock libNVRTC
    cppIntegration - C++ Integration
    cppOverload
    cudaNvSci - CUDA NvSciBuf/NvSciSync Interop
    cudaOpenMP
    cudaTensorCoreGemm - CUDA Tensor Core GEMM
    dmmaTensorCoreGemm - Double Precision Tensor Core GEMM
    fp16ScalarProduct - FP16 Scalar Product
    globalToShmemAsyncCopy - Global Memory to Shared Memory Async Copy
    immaTensorCoreGemm - Tensor Core GEMM Integer MMA
    inlinePTX - Using Inline PTX
    inlinePTX_nvrtc - Using Inline PTX with libNVRTC
    matrixMul - Matrix Multiplication (CUDA Runtime API Version)
    matrixMul_nvrtc - Matrix Multiplication with libNVRTC
    matrixMulCUBLAS - Matrix Multiplication (CUBLAS)
    matrixMulDrv - Matrix Multiplication (CUDA Driver API Version)
    memMapIPCDrv - Memmap IPC Driver API
    simpleAssert
    simpleAssert_nvrtc - simpleAssert with libNVRTC
    simpleAtomicIntrinsics - Simple Atomic Intrinsics
    simpleAtomicIntrinsics_nvrtc - Simple Atomic Intrinsics with libNVRTC
    simpleAttributes
    simpleAWBarrier - Simple Arrive Wait Barrier
    simpleCallback - Simple CUDA Callbacks
    simpleCooperativeGroups - Simple Cooperative Groups
    simpleCubemapTexture - Simple Cubemap Texture
    simpleCudaGraphs - Simple CUDA Graphs
