A closer look at problems related to the next-gen Vulkan API


Faculty of Science and Technology
Department of Computer Science

A closer look at problems related to the next-gen Vulkan API
Håvard Mathisen
INF-3981 Master's Thesis in Computer Science, June 2017

Abstract

Vulkan is a significantly lower-level graphics API than OpenGL and requires more effort from application developers, who must handle memory management, synchronization, and other low-level tasks that are specific to this API. The API is closer to the hardware and offers features that are not exposed in older APIs. For this thesis we extend an existing game engine with a Vulkan back-end. This allows us to evaluate the API and compare it with OpenGL. We find ways to efficiently solve some of the challenges encountered when using Vulkan.

Contents

1 Introduction
   1.1 Goals
2 Background
   2.1 GPU Architecture
   2.2 GPU Drivers
   2.3 Graphics APIs
      2.3.1 What is Vulkan
      2.3.2 Why Vulkan
3 Vulkan Overview
   3.1 Vulkan Architecture
   3.2 Vulkan Execution Model
   3.3 Vulkan Tools
4 Vulkan Objects
   4.1 Instances, Physical Devices, Devices
      4.1.1 Lost Device
   4.2 Command buffers
   4.3 Queues
   4.4 Memory Management
      4.4.1 Memory Heaps
      4.4.2 Memory Types
      4.4.3 Host visible memory
      4.4.4 Memory Alignment, Aliasing and Allocation Limitations
   4.5 Synchronization
      4.5.1 Execution dependencies
      4.5.2 Memory dependencies
      4.5.3 Image Layout Transitions
      4.5.4 Queue Family Ownership Transfers
   4.6 Render Pass
   4.7 Shaders
   4.8 Pipeline State Objects
   4.9 Resource Descriptors
5 A Vulkan Game Engine
   5.1 Engine Overview
   5.2 Previous work on DirectX 12
   5.3 Designing a Vulkan Graphics Engine
   5.4 Command Buffers
      5.4.1 Multi-threading
   5.5 Memory Management
   5.6 Synchronization
   5.7 Vulkan API
   5.8 Debug Markers
6 Results
   6.1 Vulkan vs OpenGL
   6.2 Higher Graphics Settings
   6.3 Async Compute
7 Discussion
   7.1 Vulkan vs OpenGL
   7.2 Command buffers and multi-threading
   7.3 Queues
   7.4 Memory Management
   7.5 Synchronization
   7.6 The Vulkan API
   7.7 Validation layers
8 Conclusion
9 Figures

1 Introduction

GPUs have evolved significantly since their early history to meet the demand for better graphics and smoother frame rates. New hardware features have been exposed to developers by extending graphics APIs like OpenGL. OpenGL had its initial release in 1992 and was designed for graphics hardware that was significantly different from the modern GPU. It has problems adapting to modern multi-core processors, new GPU architectures and applications that require more efficient and predictable performance. One example of such an application is Gear VR, which combines mobile graphics with VR. VR requires low latency, high performance and more predictable performance, while mobile graphics requires high efficiency, low power usage and support for tile-based GPU architectures. It was time for a ground-up redesign, even though OpenGL has aged quite well through added extensions and new versions.

Vulkan is an API designed to meet the demands of modern graphics applications. Not only does the API expose new graphics hardware features, it is also designed to be programmed using modern multi-core processors. Some advantages of Vulkan are:

• Designed to allow for more efficient use of CPU and GPU resources. The added efficiency comes from a closer mapping of the API to the hardware.
• It is a lower-level¹ API that gives more control to developers.
• Thinner drivers with less overhead and latency, which should remove some of the micro-stuttering that older drivers had. Drivers should not have to do any run-time shader recompilation.
• Intended to scale to multiple threads.
• Exposes new hardware queues for compute and DMA.
A good motivation for learning about Vulkan is gaining a better understanding of how GPU drivers work. In this project we take a closer look at Vulkan.

¹ Note that a lower-level API is not the same as a low-level API.

1.1 Goals

The main goal of this thesis is to determine how a graphics engine can be designed to better utilize the underlying hardware by using the Vulkan API. Central questions are:

• How can we multi-thread the tasks involved in generating commands for the GPU?
• How can we utilize new hardware features that the API exposes, such as the new queues for doing compute and DMA concurrently with the graphics engine?
• What is the best way to do memory management?
• How can we best manage and synchronize resources?
• How does Vulkan compare to OpenGL?
• How do we make useful abstractions that minimize the complexity of the API?

To answer these questions we design and implement a Vulkan graphics engine for an existing game engine written by the author.

2 Background

In this chapter, we explain modern GPU architectures and drivers before introducing the Vulkan graphics API.

2.1 GPU Architecture

Modern GPU architectures come in multiple forms. There are GPUs integrated on a SoC and standalone discrete graphics cards, GPUs for mobile and GPUs for desktop. Different GPUs have different technical capabilities. Integrated GPUs all have a unified memory architecture (UMA), meaning that the CPU and the GPU share memory, while discrete GPUs have dedicated memory in addition to sharing system memory with the CPU. Modern GPUs also share virtual memory space with the CPU. Desktop GPUs usually use a feed-forward rasterizing architecture, while mobile GPUs use a tiled rendering architecture. Tiled rendering defers rasterization: the geometry data of the scene is first stored in a screen-space tiled cache, which is later used to render the scene one tile at a time.
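The binning step of tiled rendering can be illustrated with a small sketch. This is a toy model and not how any particular GPU implements it: the function name `bin_triangles`, the 16-pixel tile size and the bounding-box overlap test are all assumptions made for illustration. A bounding-box test is conservative, so a triangle may be listed in a tile that it does not actually cover.

```python
import math
from collections import defaultdict

TILE_SIZE = 16  # hypothetical tile edge length in pixels


def bin_triangles(triangles, width, height, tile_size=TILE_SIZE):
    """Return a dict mapping tile (tx, ty) to indices of triangles whose
    screen-space bounding box overlaps that tile."""
    tiles_x = (width + tile_size - 1) // tile_size
    tiles_y = (height + tile_size - 1) // tile_size
    bins = defaultdict(list)
    for index, triangle in enumerate(triangles):
        xs = [point[0] for point in triangle]
        ys = [point[1] for point in triangle]
        # Convert the bounding box to tile coordinates and clamp it to the screen.
        min_tx = max(math.floor(min(xs)) // tile_size, 0)
        max_tx = min(math.floor(max(xs)) // tile_size, tiles_x - 1)
        min_ty = max(math.floor(min(ys)) // tile_size, 0)
        max_ty = min(math.floor(max(ys)) // tile_size, tiles_y - 1)
        for ty in range(min_ty, max_ty + 1):
            for tx in range(min_tx, max_tx + 1):
                bins[(tx, ty)].append(index)
    return dict(bins)


# Two triangles on a 64x32 screen, given as three (x, y) points each.
triangles = [
    ((1, 1), (10, 1), (1, 10)),
    ((20, 20), (40, 20), (20, 40)),
]
bins = bin_triangles(triangles, width=64, height=32)
for tile in sorted(bins):
    print(tile, bins[tile])
```

A renderer built on this idea would then walk the tiles one by one and rasterize only the triangles listed for each tile, which is what allows the working set for a tile to stay in on-chip memory.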
Tiled rendering lets us move the framebuffer out of main memory and into high-speed on-chip memory, which can reduce the memory bandwidth used [16]. Even discrete desktop GPUs are starting to use similar techniques to reduce memory bandwidth² ³.

Modern GPUs have features not exposed in the OpenGL API. They can have multiple compute engines that execute compute workloads asynchronously to the graphics engine. They can also execute memory copies using the DMA engine asynchronously to the other engines.

² http://www.realworldtech.com/tile-based-rasterization-nvidia-gpus/
³ http://www.anandtech.com/show/11002/the-amd-vega-gpu-architecture-teaser/

2.2 GPU Drivers

GPU drivers work by packing commands for the GPU into command buffers. A graphics driver has two components: a user-space library and a kernel module. Commands like Draw*() or Dispatch() are not executed immediately on the GPU when the function is called, but are staged for later execution in a command buffer in the user-space driver. When the command buffer fills up with enough commands, they are optimized and sent to the kernel. The kernel ensures that the commands are valid and do not access memory that does not belong to the application before staging the commands for execution on the GPU. When the GPU runs out of commands in the command buffer it is currently executing, an interrupt is sent to the OS requesting a new command buffer to execute. The GPU front-end fetches its own commands from command buffers in system memory through DMA operations and executes them at its own pace. Figure 1 shows an overview of GPU drivers.

Figure 1: OpenGL Driver

As an optimization, some drivers allow the optimization step of the command buffers to be done in a separate driver thread, as shown in Figure 2. This makes draw and dispatch calls very fast in the application but comes at the cost of additional latency.
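The staging behaviour of a traditional driver can be sketched as a toy model. Nothing here corresponds to a real driver interface: the classes `UserSpaceDriver` and `KernelModule`, the `capacity` parameter and the resource names are hypothetical, and the optimization step is left out entirely. The sketch only shows that a draw or dispatch call records a command, and that the kernel sees the commands later, in batches, and validates them before queueing them for the GPU.

```python
class KernelModule:
    """Toy kernel side: validates command buffers and queues them for the GPU."""

    def __init__(self, owned_resources):
        self.owned_resources = set(owned_resources)
        self.gpu_queue = []  # validated command buffers waiting for the GPU

    def submit(self, command_buffer):
        # Reject commands that touch memory the application does not own.
        for name, resource in command_buffer:
            if resource not in self.owned_resources:
                raise PermissionError(f"{name} accesses foreign resource {resource!r}")
        self.gpu_queue.append(list(command_buffer))


class UserSpaceDriver:
    """Toy user-space side: stages commands and flushes when the buffer is full."""

    def __init__(self, kernel, capacity=4):
        self.kernel = kernel
        self.capacity = capacity  # hypothetical flush threshold
        self.staged = []

    def draw(self, resource):
        self._stage(("draw", resource))

    def dispatch(self, resource):
        self._stage(("dispatch", resource))

    def _stage(self, command):
        # The call returns immediately; nothing runs on the GPU yet.
        self.staged.append(command)
        if len(self.staged) >= self.capacity:
            self.flush()

    def flush(self):
        # A real driver would optimize the commands here; this sketch passes
        # them through unchanged.
        if self.staged:
            self.kernel.submit(self.staged)
            self.staged = []


kernel = KernelModule(owned_resources={"vertex_buffer", "storage_buffer"})
driver = UserSpaceDriver(kernel, capacity=2)
driver.draw("vertex_buffer")
print(len(kernel.gpu_queue))  # nothing submitted yet
driver.dispatch("storage_buffer")
print(kernel.gpu_queue)  # one command buffer holding both commands
```

In this model the application never knows when a command actually reaches the GPU, which is one source of the unpredictable latency that the text attributes to older driver designs.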
To take full advantage of a separate driver thread, the application might also need to triple-buffer per-frame resources, as opposed to the traditional double buffering: one buffer is used by the application, one by the driver thread and one by the GPU. This comes at the cost of extra memory usage.

Figure 2: Multi-threaded OpenGL Driver

Marchesin [14] has an extensive but unfinished introduction to Linux graphics drivers. There are multiple sources on Approaching Zero Driver Overhead (AZDO) techniques that shed light on the problems of the traditional graphics driver architecture and how to circumvent them [4] [5] [10]. AZDO is a collection of GPU techniques for removing driver overhead. The most prominent AZDO techniques move the logic that selects which resources a shader should use from the CPU to the shader itself. It is often recommended to start with the AZDO techniques when learning about GPU drivers in depth. McDonald [15] has a presentation about driver models and how to avoid sync points.

2.3 Graphics APIs

Modern graphics is based around a pipeline that specifies some fixed-function stages and some programmable shader stages. Fixed-function stages consist of steps like vertex fetching, rasterization, fragment operations and tessellation primitive generation. Programmable stages are the vertex shader, fragment shader, geometry shader, and tessellation control and evaluation shaders. The pipeline begins by fetching an index buffer used to look up a vertex buffer. Vertices are processed by vertex shaders and some optional stages before being assembled into polygons.
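The first pipeline steps described above (index fetch, vertex lookup, vertex shading and primitive assembly) can be sketched in a few lines. This is an illustrative model only: the function names are hypothetical, the "vertex shader" is a simple 2D translation standing in for a real shader, and a triangle-list topology (three indices per triangle) is assumed.

```python
def vertex_shader(position, offset):
    """Stand-in for a programmable vertex shader: translate a 2D position."""
    x, y = position
    return (x + offset[0], y + offset[1])


def run_pipeline_front_end(index_buffer, vertex_buffer, offset=(0, 0)):
    """Fetch vertices through the index buffer, shade them, and assemble
    triangles from every three consecutive shaded vertices."""
    # Vertex fetch: each index selects one entry of the vertex buffer.
    fetched = [vertex_buffer[index] for index in index_buffer]
    # Vertex shading: run the shader once per fetched vertex.
    shaded = [vertex_shader(vertex, offset) for vertex in fetched]
    # Primitive assembly: group into triangles, dropping any incomplete tail.
    return [tuple(shaded[i:i + 3]) for i in range(0, len(shaded) - 2, 3)]


# A quad stored as four unique vertices and drawn as two triangles.
vertex_buffer = [(0, 0), (1, 0), (1, 1), (0, 1)]
index_buffer = [0, 1, 2, 0, 2, 3]
for triangle in run_pipeline_front_end(index_buffer, vertex_buffer, offset=(10, 0)):
    print(triangle)
```

The index buffer lets the two triangles share vertices 0 and 2 instead of storing them twice, which is the main reason indexed drawing is used.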