GPU Accelerator and Co-Processor Capabilities

Total Page:16

File Type:pdf, Size:1020Kb

GPU Accelerator and Co-Processor Capabilities GPU Accelerator and co-processor Capabilities * Release 17.2 * Used in support of the CPU to process certain calculations and key solver computations for faster performance during a solution. - Acceleration can be used for both shared-memory parellel processing (shared-memory ANSYS) and distributed-memory parallel processing (Distributed ANSYS). - Acceleration is available for both Windows and Linux. Support by Application ANSYS Mechanical APDL supports all NVIDIA Tesla series cards and the following Quadro series cards: M6000, K6000. K5200, K5000. Additionally, ANSYS Mechanical APDL supports the following **Intel Xeon Phi co-processor cards: 3120, 5110 and 7120. ANSYS Fluent supports NVIDIA's CUDA-enabled Tesla and Quadro series workstation and server cards. ANSYS Polyflow supports all NVIDIA Tesla series cards, and the following Quadro series cards: M6000, K6000, K5200, K5000. ANSYS EMIT supports NVIDIA Tesla K-Series. ANSYS HFSS supports NVIDIA Tesla C20-Series, Tesla K-Series, Quadro K-Series (K5000 and above). ANSYS ICEPAK supports NVIDIA's CUDA-enabled Tesla and Quadro series workstation and server cards. ANSYS Maxwell supports NVIDIA Tesla C20-Series, Tesla K-Series, Quadro K-Series (K5000 and above). ANSYS Savant supports NVIDIA Tesla C20-Series, Tesla K-Series, Quadro K-Series, Quadro M-Series, and GeForce GTX Series. Tested Configurations - Check marks indicate software and hardware configurations successfully tested by ANSYS. Product Manufacturer Series Card / GPU Platform: Tested Operating System Version ANSYS APDL Mechanical ANSYS Fluent ANSYS Polyflow ANSYS EMIT ANSYS HFSS ANSYS ICEPAK ANSYS Maxwell ANSYS Savant Intel Xeon Phi Linux: P Red Hat 6.5, 6.6; SLES/SLED 11 3120 Windows: P Windows 7, 8.1, 10 Linux: P Red Hat 6.5, 6.6; SLES/SLED 11 5110 Windows: P Windows 7, 8.1, 10 Linux: P Red Hat 6.5, 6.6; SLES/SLED 11 7120 Windows: P Windows 7, 8.1, 10 Product Series Tested Operating System Version ANSYS Fluent ANSYS Polyflow ANSYS EMIT ANSYS HFSS ANSYS ICEPAK ANSYS Maxwell ANSYS Savant Manufacturer Card / GPU Platfoms: ANSYS APDL Mechanical NVIDIA Quadro Linux: 4800 Windows: P Linux: P 5800 Windows: Linux: P P Red Hat 6.5, 6.6, 6.7; SLES/SLED 11, 12 K5000 Windows: P P P P Windows 7, 8.1, 10 Linux: P K5200 Windows: P Windows 7, 8.1, 10; Red Hat 6.5, 6.6, 6.7; SLES/SLED 11, 12 Linux: P K6000 Windows: P Windows 7, 8.1, 10; Red Hat 6.5, 6.6, 6.7; SLES/SLED 11, 12 Linux: M4000 Windows: P Linux: M5000 Windows: P Linux: P Red Hat 6.5, 6.6, 6.7; SLES/SLED 11, 12 M6000 Windows: P Windows 7, 8.1 Tesla Linux: Red Hat 6.5, 6.6, 6.7; SLES/SLED 11, 12 C2075 Windows: P P Windows 7, 8.1, 10 Linux: M2075 Windows: P P Linux: P P P K20 Windows: P P P Linux: P P K20c Windows: P P P Linux: P P P K20Xm Windows: Product Series ANSYS ANSYS Savant ANSYS Fluent ANSYS Polyflow ANSYS EMIT ANSYS HFSS ANSYS ICEPAK ANSYS Maxwell Manufacturer Card / GPU Platform: ANSYS APDL Mechanical Tested Operating System Version Tesla Linux: P P K40 Winodws: P P P P Linux: P K40c Windows: P P Linux: P P P K40m Windows: P P Linux: P P P K80 Windows: P GeForce Linux: GTX 470 Windows: P Linux: GTX 580 Windows: P Linux: GTX 680 Windows: P Linux: GTX 750M Windows: P Linux: GTX 760M Windows: P Linux: GTX 780 Windows: P Linux: GTX 980 Windows: P Manufacturer Support: 16252 Intel: http://www.intel.com/content/www/us/en/homepage.html NIVIDA: http://www.nvidia.com/object/gpu-applications.html.
Recommended publications
  • Graphics Card Support List
    Graphics card support list Device Name Chipset ASUS GTXTITAN-6GD5 NVIDIA GeForce GTX TITAN ZOTAC GTX980 NVIDIA GeForce GTX980 ASUS GTX980-4GD5 NVIDIA GeForce GTX980 MSI GTX980-4GD5 NVIDIA GeForce GTX980 Gigabyte GV-N980D5-4GD-B NVIDIA GeForce GTX980 MSI GTX970 GAMING 4G GOLDEN EDITION NVIDIA GeForce GTX970 Gigabyte GV-N970IXOC-4GD NVIDIA GeForce GTX970 ASUS GTX780TI-3GD5 NVIDIA GeForce GTX780Ti ASUS GTX770-DC2OC-2GD5 NVIDIA GeForce GTX770 ASUS GTX760-DC2OC-2GD5 NVIDIA GeForce GTX760 ASUS GTX750TI-OC-2GD5 NVIDIA GeForce GTX750Ti ASUS ENGTX560-Ti-DCII/2D1-1GD5/1G NVIDIA GeForce GTX560Ti Gigabyte GV-NTITAN-6GD-B NVIDIA GeForce GTX TITAN Gigabyte GV-N78TWF3-3GD NVIDIA GeForce GTX780Ti Gigabyte GV-N780WF3-3GD NVIDIA GeForce GTX780 Gigabyte GV-N760OC-4GD NVIDIA GeForce GTX760 Gigabyte GV-N75TOC-2GI NVIDIA GeForce GTX750Ti MSI NTITAN-6GD5 NVIDIA GeForce GTX TITAN MSI GTX 780Ti 3GD5 NVIDIA GeForce GTX780Ti MSI N780-3GD5 NVIDIA GeForce GTX780 MSI N770-2GD5/OC NVIDIA GeForce GTX770 MSI N760-2GD5 NVIDIA GeForce GTX760 MSI N750 TF 1GD5/OC NVIDIA GeForce GTX750 MSI GTX680-2GB/DDR5 NVIDIA GeForce GTX680 MSI N660Ti-PE-2GD5-OC/2G-DDR5 NVIDIA GeForce GTX660Ti MSI N680GTX Twin Frozr 2GD5/OC NVIDIA GeForce GTX680 GIGABYTE GV-N670OC-2GD NVIDIA GeForce GTX670 GIGABYTE GV-N650OC-1GI/1G-DDR5 NVIDIA GeForce GTX650 GIGABYTE GV-N590D5-3GD-B NVIDIA GeForce GTX590 MSI N580GTX-M2D15D5/1.5G NVIDIA GeForce GTX580 MSI N465GTX-M2D1G-B NVIDIA GeForce GTX465 LEADTEK GTX275/896M-DDR3 NVIDIA GeForce GTX275 LEADTEK PX8800 GTX TDH NVIDIA GeForce 8800GTX GIGABYTE GV-N26-896H-B
    [Show full text]
  • NVIDIA Quadro RTX for V-Ray Next
    NVIDIA QUADRO RTX V-RAY NEXT GPU Image courtesy of © Dabarti Studio, rendered with V-Ray GPU Quadro RTX Accelerates V-Ray Next GPU Rendering Solutions for V-Ray Next GPU V-Ray Next GPU taps into the power of NVIDIA® Quadro® NVIDIA Quadro® provides a wide range of RTX-enabled RTX™ to speed up production rendering with dedicated RT solutions for desktop, mobile, server-based rendering, and Cores for ray tracing and Tensor Cores for AI-accelerated virtual workstations with NVIDIA Quadro Virtual Data denoising.¹ With up to 18X faster rendering than CPU-based Center Workstation (Quadro vDWS) software.2 With up to 96 solutions and enhanced performance with NVIDIA NVLink™, gigabytes (GB) of GPU memory available,3 Quadro RTX V-Ray Next GPU with RTX support provides incredible provides the power you need for the largest professional performance improvements for your rendering workloads. graphics and rendering workloads. “ Accelerating artist productivity is always our top Benchmark: V-Ray Next GPU Rendering Performance Increase on Quadro RTX GPUs priority, so we’re quick to take advantage of the latest ray-tracing hardware breakthroughs. By Quadro RTX 6000 x2 1885 ™ Quadro RTX 6000 104 supporting NVIDIA RTX in V-Ray GPU, we’re Quadro RTX 4000 783 bringing our customers an exciting new boost in PU 1 0 2 4 6 8 10 12 14 16 18 20 their GPU production rendering speeds.” Relatve Performance – Phillip Miller, Vice President, Product Management, Chaos Group Desktop performance Tests run on 1x Xeon old 6154 3 Hz (37 Hz Turbo), 64 B DDR4 RAM Wn10x64 Drver verson 44128 Performance results may vary dependng on the scene NVIDIA Quadro professional graphics solutions are verified and recommended for the most demanding projects by Chaos Group.
    [Show full text]
  • Nvidia Tesla P40 Gpu Accelerator
    NVIDIA TESLA P40 GPU ACCELERATOR HIGH-PERFORMANCE VIRTUAL GRAPHICS AND COMPUTE NVIDIA redefined visual computing by giving designers, engineers, scientists, and graphic artists the power to take on the biggest visualization challenges with immersive, interactive, photorealistic environments. NVIDIA® Quadro® Virtual Data GPU 1 NVIDIA Pascal GPU Center Workstation (Quadro vDWS) takes advantage of NVIDIA® CUDA Cores 3,840 Tesla® GPUs to deliver virtual workstations from the data center. Memory Size 24 GB GDDR5 H.264 1080p30 streams 24 Architects, engineers, and designers are now liberated from Max vGPU instances 24 (1 GB Profile) their desks and can access applications and data anywhere. vGPU Profiles 1 GB, 2 GB, 3 GB, 4 GB, 6 GB, 8 GB, 12 GB, 24 GB ® ® The NVIDIA Tesla P40 GPU accelerator works with NVIDIA Form Factor PCIe 3.0 Dual Slot Quadro vDWS software and is the first system to combine an (rack servers) Power 250 W enterprise-grade visual computing platform for simulation, Thermal Passive HPC rendering, and design with virtual applications, desktops, and workstations. This gives organizations the freedom to virtualize both complex visualization and compute (CUDA and OpenCL) workloads. The NVIDIA® Tesla® P40 taps into the industry-leading NVIDIA Pascal™ architecture to deliver up to twice the professional graphics performance of the NVIDIA® Tesla® M60 (Refer to Performance Graph). With 24 GB of framebuffer and 24 NVENC encoder sessions, it supports 24 virtual desktops (1 GB profile) or 12 virtual workstations (2 GB profile), providing the best end-user scalability per GPU. This powerful GPU also supports eight different user profiles, so virtual GPU resources can be efficiently provisioned to meet the needs of the user.
    [Show full text]
  • NVIDIA Launches Tegra X1 Mobile Super Chip
    NVIDIA Launches Tegra X1 Mobile Super Chip Maxwell GPU Architecture Delivers First Teraflops Mobile Processor, Powering Deep Learning and Computer Vision Applications NVIDIA today unveiled Tegra® X1, its next-generation mobile super chip with over one teraflops of processing power – delivering capabilities that open the door to unprecedented graphics and sophisticated deep learning and computer vision applications. Tegra X1 is built on the same NVIDIA Maxwell™ GPU architecture rolled out only months ago for the world's top-performing gaming graphics card, the GeForce® GTX 980. The 256-core Tegra X1 provides twice the performance of its predecessor, the Tegra K1, which is based on the previous-generation Kepler™ architecture and debuted at last year's Consumer Electronics Show. Tegra processors are built for embedded products, mobile devices, autonomous machines and automotive applications. Tegra X1 will begin appearing in the first half of the year. It will be featured in the newly announced NVIDIA DRIVE™ car computers. DRIVE PX is an auto-pilot computing platform that can process video from up to 12 onboard cameras to run capabilities providing Surround-Vision, for a seamless 360-degree view around the car, and Auto-Valet, for true self-parking. DRIVE CX is a complete cockpit platform designed to power the advanced graphics required across the increasing number of screens used for digital clusters, infotainment, head-up displays, virtual mirrors and rear-seat entertainment. "We see a future of autonomous cars, robots and drones that see and learn, with seeming intelligence that is hard to imagine," said Jen-Hsun Huang, CEO and co-founder, NVIDIA.
    [Show full text]
  • GPU-Based Deep Learning Inference
    Whitepaper GPU-Based Deep Learning Inference: A Performance and Power Analysis November 2015 1 Contents Abstract ......................................................................................................................................................... 3 Introduction .................................................................................................................................................. 3 Inference versus Training .............................................................................................................................. 4 GPUs Excel at Neural Network Inference ..................................................................................................... 5 Inference Optimizations in Caffe and cuDNN 4 ........................................................................................ 5 Experimental Setup and Testing Methodology ........................................................................................ 7 Inference on Small and Large GPUs .......................................................................................................... 8 Conclusion ................................................................................................................................................... 10 References .................................................................................................................................................. 10 2 Abstract Deep learning methods are revolutionizing various areas of machine perception. On a
    [Show full text]
  • Nvidia Cuda Getting Started Guide for Linux
    NVIDIA CUDA GETTING STARTED GUIDE FOR LINUX DU-05347-001_v7.0 | March 2015 Installation and Verification on Linux Systems TABLE OF CONTENTS Chapter 1. Introduction.........................................................................................1 1.1. System Requirements.................................................................................... 1 1.1.1. x86 32-bit Support.................................................................................. 2 1.2. About This Document.................................................................................... 3 Chapter 2. Pre-installation Actions...........................................................................4 2.1. Verify You Have a CUDA-Capable GPU................................................................ 4 2.2. Verify You Have a Supported Version of Linux.......................................................4 2.3. Verify the System Has gcc Installed................................................................... 5 2.4. Choose an Installation Method......................................................................... 5 2.5. Download the NVIDIA CUDA Toolkit....................................................................6 2.6. Handle Conflicting Installation Methods.............................................................. 6 Chapter 3. Package Manager Installation....................................................................8 3.1. Overview..................................................................................................
    [Show full text]
  • Cuda C Programming Guide
    CUDA C PROGRAMMING GUIDE PG-02829-001_v10.0 | October 2018 Design Guide CHANGES FROM VERSION 9.0 ‣ Documented restriction that operator-overloads cannot be __global__ functions in Operator Function. ‣ Removed guidance to break 8-byte shuffles into two 4-byte instructions. 8-byte shuffle variants are provided since CUDA 9.0. See Warp Shuffle Functions. ‣ Passing __restrict__ references to __global__ functions is now supported. Updated comment in __global__ functions and function templates. ‣ Documented CUDA_ENABLE_CRC_CHECK in CUDA Environment Variables. ‣ Warp matrix functions now support matrix products with m=32, n=8, k=16 and m=8, n=32, k=16 in addition to m=n=k=16. www.nvidia.com CUDA C Programming Guide PG-02829-001_v10.0 | ii TABLE OF CONTENTS Chapter 1. Introduction.........................................................................................1 1.1. From Graphics Processing to General Purpose Parallel Computing............................... 1 1.2. CUDA®: A General-Purpose Parallel Computing Platform and Programming Model.............3 1.3. A Scalable Programming Model.........................................................................4 1.4. Document Structure...................................................................................... 5 Chapter 2. Programming Model............................................................................... 7 2.1. Kernels......................................................................................................7 2.2. Thread Hierarchy........................................................................................
    [Show full text]
  • NVIDIA Ampere GA102 GPU Architecture Whitepaper
    NVIDIA AMPERE GA102 GPU ARCHITECTURE Second-Generation RTX Updated with NVIDIA RTX A6000 and NVIDIA A40 Information V2.0 Table of Contents Introduction 5 GA102 Key Features 7 2x FP32 Processing 7 Second-Generation RT Core 7 Third-Generation Tensor Cores 8 GDDR6X and GDDR6 Memory 8 Third-Generation NVLink® 8 PCIe Gen 4 9 Ampere GPU Architecture In-Depth 10 GPC, TPC, and SM High-Level Architecture 10 ROP Optimizations 11 GA10x SM Architecture 11 2x FP32 Throughput 12 Larger and Faster Unified Shared Memory and L1 Data Cache 13 Performance Per Watt 16 Second-Generation Ray Tracing Engine in GA10x GPUs 17 Ampere Architecture RTX Processors in Action 19 GA10x GPU Hardware Acceleration for Ray-Traced Motion Blur 20 Third-Generation Tensor Cores in GA10x GPUs 24 Comparison of Turing vs GA10x GPU Tensor Cores 24 NVIDIA Ampere Architecture Tensor Cores Support New DL Data Types 26 Fine-Grained Structured Sparsity 26 NVIDIA DLSS 8K 28 GDDR6X Memory 30 RTX IO 32 Introducing NVIDIA RTX IO 33 How NVIDIA RTX IO Works 34 Display and Video Engine 38 DisplayPort 1.4a with DSC 1.2a 38 HDMI 2.1 with DSC 1.2a 38 Fifth Generation NVDEC - Hardware-Accelerated Video Decoding 39 AV1 Hardware Decode 40 Seventh Generation NVENC - Hardware-Accelerated Video Encoding 40 NVIDIA Ampere GA102 GPU Architecture ii Conclusion 42 Appendix A - Additional GeForce GA10x GPU Specifications 44 GeForce RTX 3090 44 GeForce RTX 3070 46 Appendix B - New Memory Error Detection and Replay (EDR) Technology 49 Appendix C - RTX A6000 GPU Perf ormance 50 List of Figures Figure 1.
    [Show full text]
  • Geforce® Gtx 680 2Gb Vcggtx680xpb
    FILENAME: PNY-GeForce-GTX-680-2GB MODIFIED: April 25, 2013 12:02 PM GEFORCE® GTX 680 2GB VCGGTX680XPB Game-Changing Innovation PRODUCT SPECIFICATIONS: PRODUCT SPECIFICATIONS The GeForce® GTX 680 delivers more than just state-of-the-art features and technology. It gives you truly GPU GeForce® GTX 680 ® gaming-changing performance that taps into the power next generation GeForce Kepler-architecture to CUDA Cores 1536 redefine smooth, seamless, lifelike gaming. NVIDIA GPU Boost technology that dynamically maximize clock speeds to push performance to the new levels and bring out the best in every game. Ultra-smooth Core Speed 1006 MHz gaming with exciting new NVIDIA Adaptive V-Sync technology. Boost Speed 1058 MHz Memory Size 2048 MB GDDR5 PACKAGE INCLUDES REQUIREMENTS Memory • GeForce® GTX 680 2GB graphics card • PCI Express compliant motherboard with one 256 bit Interface • Quick Installation Guide dual-width x16 graphics slot • Installation DVD, which includes: • Two 6-pin PCI Express supplementary Memory 6.0 Gbps Detailed installation guide power connectors Frequency ® - NVIDIA Graphics Drivers • Minimum 550W or greater system power supply TDP 195W-Active - NVIDIA® GeForce® Demos (with a minimum 12V current rating of 38A) - PhysX™ System Software • 300 MB of available hard-disk space 2GB SLI 3-Way SLI • DVI-to-VGA adapter system memory (2GB or higher recommended) 4 Concurrent Screens*** Multi-Screen • 6-pin PCIe to Molex Power Adapter • Microsoft Windows 7, Windows Vista, or NVIDIA 3D Surround* Windows XP Operating System (32 or 64-bit)
    [Show full text]
  • PACKET 22 BOOKSTORE, TEXTBOOK CHAPTER Reading Graphics
    A.11 GRAPHICS CARDS, Historical Perspective (edited by J Wunderlich PhD in 2020) Graphics Pipeline Evolution 3D graphics pipeline hardware evolved from the large expensive systems of the early 1980s to small workstations and then to PC accelerators in the 1990s, to $X,000 graphics cards of the 2020’s During this period, three major transitions occurred: 1. Performance-leading graphics subsystems PRICE changed from $50,000 in 1980’s down to $200 in 1990’s, then up to $X,0000 in 2020’s. 2. PERFORMANCE increased from 50 million PIXELS PER SECOND in 1980’s to 1 billion pixels per second in 1990’’s and from 100,000 VERTICES PER SECOND to 10 million vertices per second in the 1990’s. In the 2020’s performance is measured more in FRAMES PER SECOND (FPS) 3. Hardware RENDERING evolved from WIREFRAME to FILLED POLYGONS, to FULL- SCENE TEXTURE MAPPING Fixed-Function Graphics Pipelines Throughout the early evolution, graphics hardware was configurable, but not programmable by the application developer. With each generation, incremental improvements were offered. But developers were growing more sophisticated and asking for more new features than could be reasonably offered as built-in fixed functions. The NVIDIA GeForce 3, described by Lindholm, et al. [2001], took the first step toward true general shader programmability. It exposed to the application developer what had been the private internal instruction set of the floating-point vertex engine. This coincided with the release of Microsoft’s DirectX 8 and OpenGL’s vertex shader extensions. Later GPUs, at the time of DirectX 9, extended general programmability and floating point capability to the pixel fragment stage, and made texture available at the vertex stage.
    [Show full text]
  • Advanced Computing
    National Aeronautics and Space Administration S NASA Advanced Computing e NASA Advanced Supercomputing Division has been the agency’s primary resource for high performance computing, data storage, and advanced modeling and simulation tools for over 30 years. From the 1.9 gigaflop Cray-2 system installed in 1985 to the current petascale Pleiades, Electra, and Aitken superclusters, our facility at NASA’s Ames Research Center in Silicon Valley has housed over 40 production and testbed super- computers supporting NASA missions and projects in aeronautics, human space exploration, Earth science, and astrophysics. www.nasa.gov SUPERCOMPUTING (NAS) DIVISION NASA ADVANCED e NASA Advanced Supercomputing (NAS) facility’s NVIDIA Tesla V100 GPUs, to the Pleiades supercomputer, computing environment includes the three most powerful providing dozens of teraflops of computational boost to supercomputers in the agency: the petascale Electra, Aitken, each of the enhanced nodes. and Pleiades systems. Developed with a focus on flexible scalability, the systems at NAS have the ability to expand e NAS facility also houses several smaller systems to sup- and upgrade hardware with minimal impact to users, allow- port various computational needs, including the Endeavour ing us the ability to continually provide the most advanced shared-memory system for large-memory jobs, and the computing technologies to support NASA’s many inspiring hyperwall visualization system, which provides a unique missions and projects. environment for researchers to explore their very large, high-dimensional datasets. Additionally, we support both Part of this hardware diversity includes the integration of short-term RAID and long-term tape mass storage systems, graphics processing units (GPUs), which can speed up providing more than 1,500 users running jobs on the some codes and algorithms run on NAS systems.
    [Show full text]
  • Virtual GPU Software User Guide Is Organized As Follows: ‣ This Chapter Introduces the Capabilities and Features of NVIDIA Vgpu Software
    Virtual GPU Software User Guide DU-06920-001 _v13.0 Revision 02 | August 2021 Table of Contents Chapter 1. Introduction to NVIDIA vGPU Software..............................................................1 1.1. How NVIDIA vGPU Software Is Used....................................................................................... 1 1.1.2. GPU Pass-Through.............................................................................................................1 1.1.3. Bare-Metal Deployment.....................................................................................................1 1.2. Primary Display Adapter Requirements for NVIDIA vGPU Software Deployments................2 1.3. NVIDIA vGPU Software Features............................................................................................. 3 1.3.1. GPU Instance Support on NVIDIA vGPU Software............................................................3 1.3.2. API Support on NVIDIA vGPU............................................................................................ 5 1.3.3. NVIDIA CUDA Toolkit and OpenCL Support on NVIDIA vGPU Software...........................5 1.3.4. Additional vWS Features....................................................................................................8 1.3.5. NVIDIA GPU Cloud (NGC) Containers Support on NVIDIA vGPU Software...................... 9 1.3.6. NVIDIA GPU Operator Support.......................................................................................... 9 1.4. How this Guide Is Organized..................................................................................................10
    [Show full text]