Realtime Computer Graphics on Gpus Speedup Techniques, Other Apis

Total Page:16

File Type:pdf, Size:1020Kb

Realtime Computer Graphics on Gpus Speedup Techniques, Other Apis Optimizations Intro Optimize Rendering Textures Other APIs Realtime Computer Graphics on GPUs Speedup Techniques, Other APIs Jan Kolomazn´ık Department of Software and Computer Science Education Faculty of Mathematics and Physics Charles University in Prague 1 / 34 Optimizations Intro Optimize Rendering Textures Other APIs Optimizations Intro 2 / 34 Optimizations Intro Optimize Rendering Textures Other APIs PERFORMANCE BOTTLENECKS I I Most of the applications require steady framerate – What can slow down rendering? I Too much geometry rendered I CPU/GPU can process only limited amount of data per second I Render only what is visible and with adequate details I Lighting/shading computation I Use simpler material model I Limit number of generated fragments I Data transfers between CPU/GPU I Try to reuse/cache data on GPU I Use async transfers 3 / 34 Optimizations Intro Optimize Rendering Textures Other APIs PERFORMANCE BOTTLENECKS II I State changes I Bundle object by materials I Use UBOs (uniform buffer objects) I GPU idling – cannot generate work fast enough I Multithreaded task generation I Not everything must be done in every frame – reuse information (temporal consistency) I CPU/Driver hotspots I Bindless textures I Instanced rendering I Indirect rendering 4 / 34 Optimizations Intro Optimize Rendering Textures Other APIs DIFFERENT NEEDS I Large open world I Rendering lots of objects I Not so many details needed for distant sections I Indoors scenes I Often lots of same objects – instaced rendering I Only small portion of the scene visible – occluded by walls, . I CAD, molecular biology, . I All geometry visible at once I Lots of self-occlusions I Switching levels of detail 5 / 34 Optimizations Intro Optimize Rendering Textures Other APIs PROFILING I Three rules of profiling: I Measure! I Measure! I Measure! I How to measure? I Code instrumentation I Sampling profilers I HW event counters I Profiling influences performance I Tools for GPU: I NVIDIA Nsight I RenderDoc I CodeXL 6 / 34 Optimizations Intro Optimize Rendering Textures Other APIs SCENE GRAPH I Boundary volume hierarchies (BVH) I Spatial partitioning I Binary space partitioning I Quadtrees, octrees I Grids I Additional meta-objects: I Occluders I Collision geometry 7 / 34 Optimizations Intro Optimize Rendering Textures Other APIs STANDARD RENDERING PROCEDURE 1. Analyze world view 2. Transfer data on GPU I Upload/reuse geometry in VBOs I Upload/reuse textures 3. Issue draw commands 8 / 34 Optimizations Intro Optimize Rendering Textures Other APIs Optimize Rendering 9 / 34 Optimizations Intro Optimize Rendering Textures Other APIs QUERY OBJECTS I How to use: glGenQueries(GLsizei n, GLuint * ids); glBeginQuery(GLenum target, GLuint id); // ... glEndQuery(GLenum target); I Example queries: GL_SAMPLES_PASSED GL_ANY_SAMPLES_PASSED GL_PRIMITIVES_GENERATED ... I Usualy used with one frame lag 10 / 34 Optimizations Intro Optimize Rendering Textures Other APIs OCCLUSION CULLING I Do not render objects hidden behind others I Helper objects – occluders I CPU processing I Analyze scene graph + occluders to filter rendered geometry I GPU processing I Z-buffer pre-render I Render occluders to Z-buffer I Occlusion queries I Temporal consistency – Z-buffer reprojection 11 / 34 Optimizations Intro Optimize Rendering Textures Other APIs CONDITIONAL RENDERING I GL version is 3.0 or greater I Render cheap object I Use occlusion query to see if any of it is visible I If it isn’t, skip rendering of expensive object glBeginConditionalRender(GLuint id, GLenum mode); glEndConditionalRender(); 12 / 34 Optimizations Intro Optimize Rendering Textures Other APIs Z-BUFFER OPTIMIZATIONS I Early Z-test: I Fragment shader cannot modify depth value I Front-to-back rendering (rough sort) I Double speed Z-only: I First pass depth (stencil) only, write z-buffer – faster when no color buffer present I Second pass color writes, z-buffer read-only I Optimizations: I Z-pass for major occluders I Deffered shading 13 / 34 Optimizations Intro Optimize Rendering Textures Other APIs BACKFACE CULLING I Good closed meshes I No need for triangle back-side rasterization I GPU can filter (face cull) according to vertex order: I glEnable( GL CULL FACE ); I glFrontFace( GL CCW ); I glCullFace( GL BACK ); // draw front faces only 14 / 34 Optimizations Intro Optimize Rendering Textures Other APIs CLIPPING PLANES I Objects outside of the view frustum are not rendered I Set far-plane reasonable close (also increases z-buffer precision) I Harsh end of all geometry I Render background geometry I Hide the far plane in fog (glFog()) I More sophisticated effect in fragment shader/deffered shading 15 / 34 Optimizations Intro Optimize Rendering Textures Other APIs LEVEL OF DETAIL (LOD) I Optimal rendering efficiency: I details of distant objects ( pixel size) need not be drawn I the closest objects (and/or user focus) need the best available visual quality I dynamic level of detail I rendering system adjusts individual rendering accuracy I global parameter definition (e.g. total approximate number of drawn triangles) I advance data preparation: discrete LoD levels / continuous LoD pre-processing.. 16 / 34 Optimizations Intro Optimize Rendering Textures Other APIs DISCRETE LOD I fixed LoD levels are prepared in advance I shape approximations with different accuracy I can come from the finest model – generalization (can be time consuming) I rendering – choosing appropriate LoD level: I according to object-camera distance I according to bounding object projection size or even exact object projection – errors perpendicular to viewing direction are most noticeable I object importance, “player focus”, .. I global balancing (declared number of triangles to draw) 17 / 34 Optimizations Intro Optimize Rendering Textures Other APIs LODTRANSITIONS I simple switching I hysteresis must be introduced (against unwanted ”popping”) I blending neighboring LoD levels I drawing both neighboring levels using transparency I linear combination (blending, “transition”) I another option I current LoD level is opaque I additional semitransparent new LoD (“over” a operation) I “z-writing” enabled only for the current LoD level 18 / 34 Optimizations Intro Optimize Rendering Textures Other APIs GENERATING LODS I top-down approach I based on the simplest shape representation (low-poly), additional details are introduced I less frequent (subdivision surfaces..) I bottom-up I starts with detailed (most accurate) model I gradual simplification / generalization (data reduction) 19 / 34 Optimizations Intro Optimize Rendering Textures Other APIs TRIANGLE STRIPS/FANS v0 v0 v1 v1 v1 v2 v2 v2 v0 v3 v3 v3 v5 v5 v4 v4 v4 v5 Figure: Figure: Figure: GL TRIANGLES GL TRIANGLE STRIP GL TRIANGLE FAN 20 / 34 Optimizations Intro Optimize Rendering Textures Other APIs LIMIT DRAW CALLS I Instanced rendering I Multidraw commands – bundle more indexed render calls together void glMultiDrawElements(GLenum mode, const GLsizei * count, GLenum type, const GLvoid *indices, GLsizei drawcount); I Indirect rendering I GL DRAW INDIRECT BUFFER I Fill on GPU void glDrawElementsIndirect() void glDrawArraysIndirect() void glMultiDrawElementsIndirect() 21 / 34 Optimizations Intro Optimize Rendering Textures Other APIs MULTITHREADING I Contexts are not thread-safe I Context can be current only in one thread I It can be used in multiple threads, but not in parallel I Multiple contexts can be used concurrently I Enable resource sharing – textures, VBOs, . I Split preparation work between multiple threads (task systems) 22 / 34 Optimizations Intro Optimize Rendering Textures Other APIs Textures 23 / 34 Optimizations Intro Optimize Rendering Textures Other APIs TEXTURE ACCESS I Texture accesses expensive I Limit number of tex reads per fragment I Use reasonable sized textures I Use mip-mapping I Neighboring fragments should access neighboring texels I Textures have spatial caching 24 / 34 Optimizations Intro Optimize Rendering Textures Other APIs TEXTURE COMPRESSION I I Invention: S3 in DirectX 6 (1998) I DXTC, in OpenGL: S3TC, DXT1 I Fixed compression ratio I necessary for memory management I 4:1 to 6:1 lossy compression I Decomposition into rectangle tiles (4 × 4 px) I each tile: two 16-bit colors and sixteen 2-bit indices (together – 4 bpp) I two extreme colors (R5G6B5), two more in-between colors (or one in-between and black) I each pixel is represented by a reference to one color I Extensions add alpha compression (128-bit): DXT3, DXT5 25 / 34 Optimizations Intro Optimize Rendering Textures Other APIs TEXTURE COMPRESSION II I NVIDIA VTC (Volume Texture Compression) I 3D variant of S3TC I data blocks 4 × 4 × 1, 4 × 4 × 2, 4 × 4 × 4 I BPTC Texture Compression: I Byte stream I Unsigned + float I Multiple gradients 26 / 34 Optimizations Intro Optimize Rendering Textures Other APIs Other APIs 27 / 34 Optimizations Intro Optimize Rendering Textures Other APIs OPENGL ES I OpenGL for Embedded Systems (smartphones, tablet computers, video game consoles) I Most widely deployed 3D graphics API in history I Subset of the OpenGL API I Latest version of OpenGL ES 3.2 I Adds tesselation and geometry shaders 28 / 34 Optimizations Intro Optimize Rendering Textures Other APIs WEBGL I JavaScript API for rendering interactive 2D and 3D graphics within any compatible web browser without the use of plug-ins I Control code written in JavaScript and shader code that is written in OpenGL ES Shading Language I WebGL 1.0 is based on OpenGL ES 2.0, HTML5 canvas element I WebGL 2.0 is based on OpenGL ES 3.0 I WebAssembly I Compile
Recommended publications
  • Amd Filed: February 24, 2009 (Period: December 27, 2008)
    FORM 10-K ADVANCED MICRO DEVICES INC - amd Filed: February 24, 2009 (period: December 27, 2008) Annual report which provides a comprehensive overview of the company for the past year Table of Contents 10-K - FORM 10-K PART I ITEM 1. 1 PART I ITEM 1. BUSINESS ITEM 1A. RISK FACTORS ITEM 1B. UNRESOLVED STAFF COMMENTS ITEM 2. PROPERTIES ITEM 3. LEGAL PROCEEDINGS ITEM 4. SUBMISSION OF MATTERS TO A VOTE OF SECURITY HOLDERS PART II ITEM 5. MARKET FOR REGISTRANT S COMMON EQUITY, RELATED STOCKHOLDER MATTERS AND ISSUER PURCHASES OF EQUITY SECURITIES ITEM 6. SELECTED FINANCIAL DATA ITEM 7. MANAGEMENT S DISCUSSION AND ANALYSIS OF FINANCIAL CONDITION AND RESULTS OF OPERATIONS ITEM 7A. QUANTITATIVE AND QUALITATIVE DISCLOSURE ABOUT MARKET RISK ITEM 8. FINANCIAL STATEMENTS AND SUPPLEMENTARY DATA ITEM 9. CHANGES IN AND DISAGREEMENTS WITH ACCOUNTANTS ON ACCOUNTING AND FINANCIAL DISCLOSURE ITEM 9A. CONTROLS AND PROCEDURES ITEM 9B. OTHER INFORMATION PART III ITEM 10. DIRECTORS, EXECUTIVE OFFICERS AND CORPORATE GOVERNANCE ITEM 11. EXECUTIVE COMPENSATION ITEM 12. SECURITY OWNERSHIP OF CERTAIN BENEFICIAL OWNERS AND MANAGEMENT AND RELATED STOCKHOLDER MATTERS ITEM 13. CERTAIN RELATIONSHIPS AND RELATED TRANSACTIONS AND DIRECTOR INDEPENDENCE ITEM 14. PRINCIPAL ACCOUNTANT FEES AND SERVICES PART IV ITEM 15. EXHIBITS, FINANCIAL STATEMENT SCHEDULES SIGNATURES EX-10.5(A) (OUTSIDE DIRECTOR EQUITY COMPENSATION POLICY) EX-10.19 (SEPARATION AGREEMENT AND GENERAL RELEASE) EX-21 (LIST OF AMD SUBSIDIARIES) EX-23.A (CONSENT OF ERNST YOUNG LLP - ADVANCED MICRO DEVICES) EX-23.B
    [Show full text]
  • Candidate Features for Future Opengl 5 / Direct3d 12 Hardware and Beyond 3 May 2014, Christophe Riccio
    Candidate features for future OpenGL 5 / Direct3D 12 hardware and beyond 3 May 2014, Christophe Riccio G-Truc Creation Table of contents TABLE OF CONTENTS 2 INTRODUCTION 4 1. DRAW SUBMISSION 6 1.1. GL_ARB_MULTI_DRAW_INDIRECT 6 1.2. GL_ARB_SHADER_DRAW_PARAMETERS 7 1.3. GL_ARB_INDIRECT_PARAMETERS 8 1.4. A SHADER CODE PATH PER DRAW IN A MULTI DRAW 8 1.5. SHADER INDEXED LOSE STATES 9 1.6. GL_NV_BINDLESS_MULTI_DRAW_INDIRECT 10 1.7. GL_AMD_INTERLEAVED_ELEMENTS 10 2. RESOURCES 11 2.1. GL_ARB_BINDLESS_TEXTURE 11 2.2. GL_NV_SHADER_BUFFER_LOAD AND GL_NV_SHADER_BUFFER_STORE 11 2.3. GL_ARB_SPARSE_TEXTURE 12 2.4. GL_AMD_SPARSE_TEXTURE 12 2.5. GL_AMD_SPARSE_TEXTURE_POOL 13 2.6. SEAMLESS TEXTURE STITCHING 13 2.7. 3D MEMORY LAYOUT FOR SPARSE 3D TEXTURES 13 2.8. SPARSE BUFFER 14 2.9. GL_KHR_TEXTURE_COMPRESSION_ASTC 14 2.10. GL_INTEL_MAP_TEXTURE 14 2.11. GL_ARB_SEAMLESS_CUBEMAP_PER_TEXTURE 15 2.12. DMA ENGINES 15 2.13. UNIFIED MEMORY 16 3. SHADER OPERATIONS 17 3.1. GL_ARB_SHADER_GROUP_VOTE 17 3.2. GL_NV_SHADER_THREAD_GROUP 17 3.3. GL_NV_SHADER_THREAD_SHUFFLE 17 3.4. GL_NV_SHADER_ATOMIC_FLOAT 18 3.5. GL_AMD_SHADER_ATOMIC_COUNTER_OPS 18 3.6. GL_ARB_COMPUTE_VARIABLE_GROUP_SIZE 18 3.7. MULTI COMPUTE DISPATCH 19 3.8. GL_NV_GPU_SHADER5 19 3.9. GL_AMD_GPU_SHADER_INT64 20 3.10. GL_AMD_GCN_SHADER 20 3.11. GL_NV_VERTEX_ATTRIB_INTEGER_64BIT 21 3.12. GL_AMD_ SHADER_TRINARY_MINMAX 21 4. FRAMEBUFFER 22 4.1. GL_AMD_SAMPLE_POSITIONS 22 4.2. GL_EXT_FRAMEBUFFER_MULTISAMPLE_BLIT_SCALED 22 4.3. GL_NV_MULTISAMPLE_COVERAGE AND GL_NV_FRAMEBUFFER_MULTISAMPLE_COVERAGE 22 4.4. GL_AMD_DEPTH_CLAMP_SEPARATE 22 5. BLENDING 23 5.1. GL_NV_TEXTURE_BARRIER 23 5.2. GL_EXT_SHADER_FRAMEBUFFER_FETCH (OPENGL ES) 23 5.3. GL_ARM_SHADER_FRAMEBUFFER_FETCH (OPENGL ES) 23 5.4. GL_ARM_SHADER_FRAMEBUFFER_FETCH_DEPTH_STENCIL (OPENGL ES) 23 5.5. GL_EXT_PIXEL_LOCAL_STORAGE (OPENGL ES) 24 5.6. TILE SHADING 25 5.7. GL_INTEL_FRAGMENT_SHADER_ORDERING 26 5.8. GL_KHR_BLEND_EQUATION_ADVANCED 26 5.9.
    [Show full text]
  • AMD Firepro™ W5000
    AMD FirePro™ W5000 Be Limitless, When Every Detail Counts. Powerful mid-range workstation graphics. This powerful product, designed for delivering superior performance for CAD/CAE and Media workflows, can process Key Features: up to 1.65 billion triangles per second. This means during > Utilizes Graphics Core Next (GCN) to the design process you can easily interact and render efficiently balance compute tasks with your 3D models, while the competition can only process 3D workloads, enabling multi-tasking that is designed to optimize utilization up to 0.41 billion triangles per second (up to four times and maximize performance. less performance). It also offers double the memory > Unmatched application of competing products (2GB vs. 1GB) and 2.5x responsiveness in your workflow, the memory bandwidth. It’s the ideal solution whether in advanced visualization, for professionals working with a broad range of complex models, large data sets or applications, moderately complex models and datasets, video editing. and advanced visual effects. > AMD ZeroCore Power Technology enables your GPU to power down when your monitor is off. Product features: > AMD ZeroCore Power technology leverages > GeometryBoost—the GPU processes > Optimized and certified for major CAD and M&E AMD’s leadership in notebook power efficiency geometry data at a rate of twice per clock cycle, doubling the rate of primitive applications delivering 1 TFLOP of single precision and 80 to enable our desktop GPUs to power down and vertex processing. GFLOPs of double precision performance with when your monitor is off, also known as the > AMD Eyefinity Technology— outstanding reliability for the most demanding “long idle state.” Industry-leading multi-display professional tasks.
    [Show full text]
  • AMD Firepro™Professional Graphics for CAD & Engineering and Media & Entertainment
    AMD FirePro™Professional Graphics for CAD & Engineering and Media & Entertainment Performance at every price point. AMD FirePro professional graphics offer breakthrough capabilities that can help maximize productivity and help lower cost and complexity — giving you the edge you need in your business. Outstanding graphics performance, compute power and ultrahigh-resolution multidisplay capabilities allows broadcast, design and engineering professionals to work at a whole new level of detail, speed, responsiveness and creativity. AMD FireProTM W9100 AMD FireProTM W8100 With 16GB GDDR5 memory and the ability to support up to six 4K The new AMD FirePro W8100 workstation graphics card is based on displays via six Mini DisplayPort outputs,1 the AMD FirePro W9100 the AMD Graphics Core Next (GCN) GPU architecture and packs up graphics card is the ideal single-GPU solution for the next generation to 4.2 TFLOPS of compute power to accelerate your projects beyond of ultrahigh-resolution visualization environments. just graphics. AMD FireProTM W7100 AMD FireProTM W5100 The new AMD FirePro W7100 graphics card delivers 8GB The new AMD FirePro™ W5100 graphics card delivers optimized of memory, application performance and special features application and multidisplay performance for midrange users. that media and entertainment and design and engineering With 4GB of ultra-fast GDDR5 memory, users can tackle moderately professionals need to take their projects to the next level. complex models, assemblies, data sets or advanced visual effects with ease. AMD FireProTM W4100 AMD FireProTM W2100 In a class of its own, the AMD FirePro Professional graphics starts with AMD W4100 graphics card is the best choice FirePro W2100 graphics, delivering for entry-level users who need a boost in optimized and certified professional graphics performance to better address application performance that similarly- their evolving workflows.
    [Show full text]
  • Comparison of Technologies for General-Purpose Computing on Graphics Processing Units
    Master of Science Thesis in Information Coding Department of Electrical Engineering, Linköping University, 2016 Comparison of Technologies for General-Purpose Computing on Graphics Processing Units Torbjörn Sörman Master of Science Thesis in Information Coding Comparison of Technologies for General-Purpose Computing on Graphics Processing Units Torbjörn Sörman LiTH-ISY-EX–16/4923–SE Supervisor: Robert Forchheimer isy, Linköpings universitet Åsa Detterfelt MindRoad AB Examiner: Ingemar Ragnemalm isy, Linköpings universitet Organisatorisk avdelning Department of Electrical Engineering Linköping University SE-581 83 Linköping, Sweden Copyright © 2016 Torbjörn Sörman Abstract The computational capacity of graphics cards for general-purpose computing have progressed fast over the last decade. A major reason is computational heavy computer games, where standard of performance and high quality graphics con- stantly rise. Another reason is better suitable technologies for programming the graphics cards. Combined, the product is high raw performance devices and means to access that performance. This thesis investigates some of the current technologies for general-purpose computing on graphics processing units. Tech- nologies are primarily compared by means of benchmarking performance and secondarily by factors concerning programming and implementation. The choice of technology can have a large impact on performance. The benchmark applica- tion found the difference in execution time of the fastest technology, CUDA, com- pared to the slowest, OpenCL, to be twice a factor of two. The benchmark applica- tion also found out that the older technologies, OpenGL and DirectX, are compet- itive with CUDA and OpenCL in terms of resulting raw performance. iii Acknowledgments I would like to thank Åsa Detterfelt for the opportunity to make this thesis work at MindRoad AB.
    [Show full text]
  • AMD Codexl 1.7 GA Release Notes
    AMD CodeXL 1.7 GA Release Notes Thank you for using CodeXL. We appreciate any feedback you have! Please use the CodeXL Forum to provide your feedback. You can also check out the Getting Started guide on the CodeXL Web Page and the latest CodeXL blog at AMD Developer Central - Blogs This version contains: For 64-bit Windows platforms o CodeXL Standalone application o CodeXL Microsoft® Visual Studio® 2010 extension o CodeXL Microsoft® Visual Studio® 2012 extension o CodeXL Microsoft® Visual Studio® 2013 extension o CodeXL Remote Agent For 64-bit Linux platforms o CodeXL Standalone application o CodeXL Remote Agent Note about installing CodeAnalyst after installing CodeXL for Windows AMD CodeAnalyst has reached End-of-Life status and has been replaced by AMD CodeXL. CodeXL installer will refuse to install on a Windows station where AMD CodeAnalyst is already installed. Nevertheless, if you would like to install CodeAnalyst, do not install it on a Windows station already installed with CodeXL. Uninstall CodeXL first, and then install CodeAnalyst. System Requirements CodeXL contains a host of development features with varying system requirements: GPU Profiling and OpenCL Kernel Debugging o An AMD GPU (Radeon HD 5000 series or newer, desktop or mobile version) or APU is required. o The AMD Catalyst Driver must be installed, release 13.11 or later. Catalyst 14.12 (driver 14.501) is the recommended version. See "Getting the latest Catalyst release" section below. For GPU API-Level Debugging, a working OpenCL/OpenGL configuration is required (AMD or other). CPU Profiling o Time-Based Profiling can be performed on any x86 or AMD64 (x86-64) CPU/APU.
    [Show full text]
  • Media Kit 2019 Technology for Optimal Engineering Design
    MEDIA KIT 2019 TECHNOLOGY FOR OPTIMAL ENGINEERING DESIGN DigitalEngineering247.com Who We Are From the Publisher DE’S MISSION Welcome to Digital Engineering, a B2B media business dedicated to technology for optimal engineering The design engineer is in the center of a design. We have a dedicated following of over 88,000 design engineers and engineering & IT managers at the never-ending cycle of improvement that very front end of product design. DE’s targeted audience consumes our cutting-edge editorial content that is provided in the form of a magazine (print & digital formats), five e-newsletters, editorial webcasts and our relies on the integration of multiple redesigned website at www.DigitalEngineering247.com technologies, multi-disciplinary engineering Unlike other design engineering titles, we do not try to be all things to all people. We are not a component teams, and the collection and dissemination magazine. We focus on the technologies that drive better designs. We cover the following: of design, simulation, test and market • design software data. In addition to using the best available • simulation & analysis software technology to optimize each stage of the • prototyping options (additive/subtractive/short-run manufacturing) • test & measurement cycle, a fully optimized workflow enhances • IoT/sensors to communicate the data communication and collaboration along a • computer options (workstations/HPC/Cloud) to run the software and manage the data digital thread that connects the engineering • workflow software to communicate the design process throughout the entire lifecycle via the digital thread. team, colleagues in other departments, Please see the specific departments and product coverage on the next page.
    [Show full text]
  • GPU Rigid Body Simulation Using Opencl Erwin Coumans
    GPU rigid body simulation using OpenCL Erwin Coumans, http://bulletphysics.org Introduction This document discusses our efforts to rewrite our rigid body simulator, Bullet 3.x, to make it suitable for many-core systems, such as GPUs, using OpenCL. Although OpenCL is thought, most of it can be applied to projects using other GPU compute languages, such as NVIDIA CUDA and Microsoft DX11 Direct Compute. Bullet is physics simulation software, in particular for rigid body dynamics and collision detection in. The project started around 2003, as in-house physics engine for a Playstation 2 game. In 2005 it was made publically available as open source under a liberal zlib license. Since then it is being used by game developers, movie studios and 3d modelers and authoring tools such as Maya, Blender, Cinema 4D etc. Before the rewrite, we have been optimizing and refactoring Bullet 2.x for multi-code, and we’ll briefly touch on those efforts. Previously we have worked on simplified GPU rigid body simulation, such as [Harada 2010] and [Coumans 2009]. Our recent GPU rigid body dynamics work has approached the same quality compared to the CPU version. The Bullet 3.x rigid body and collision detection pipeline runs 100% on the GPU using OpenCL. On a high- end desktop GPU it can simulate 100 thousand rigid bodies in real-time. The source code is available as open source at http://github.com/erwincoumans/bullet3 . Appendix B shows how to build and use the project on Windows, Linux and Mac OSX. Bullet 2.x Refactoring Bullet 2.x is written in modular C++ and its API was initially designed to be flexible and extendible, rather than optimized for speed.
    [Show full text]
  • Club 3D Radeon R9 380 Royalqueen 4096MB GDDR5 256BIT | PCI EXPRESS 3.0
    Club 3D Radeon R9 380 royalQueen 4096MB GDDR5 256BIT | PCI EXPRESS 3.0 Product Name Club 3D Radeon R9 380 4GB royalQueen 4096MB GDDR5 256 BIT | PCI Express 3.0 Product Series Club 3D Radeon R9 300 Series codename ‘Antigua’ Itemcode CGAX-R93858 EAN code 8717249401469 UPC code 854365005428 Description: OS Support: The new Club 3D Radeon™ R9 380 4GB royalQueen was conceived to OS Support: Windows 7, Windows 8.1, Windows 10 play hte most demanding games at 1080p, 1440p, all the way up to 4K 3D API Support: DirextX 11.2, DirectX 12, Vulkan, AMD Mantle. resolution. Get quality that rivals 4K, even on 1080p displays thanks to VSR (Virtual Super Resolution). Loaded with the latest advancements in GCN architecture including AMD FreeSync™, AMD Eyefinity and AMD In the box: LiquidVR™ technologies plus support for the nex gen APIs DirectX® 12, • Club 3D R9 380 royalQueen Graphics card Vulkan™ and AMD mantle, the Club 3D R9 380 royalQueen is for the • Club 3D Driver & E-manual CD serious PC Gamer. • Club 3D gaming Door hanger • Quick install guide Features: Outputs: Other info: • Club 3D Radeon R9 380 royalQueen 4GB • DisplayPort 1.2a • Box size: 293 x 195 x 69 mm • 1792 Stream Processors • HDMI 1.4a • Card size: 207 x 111 x 38 mm • Clock speed up to 980 MHz • Dual Link DVI-I • Weight: 0.6 Kg • 4096 MB GDDR5 Memory at 5900MHz • Dual Link DVI-D • Profile: Standard profile • 256 BIT Memory Bus • Slot width: 2 Slots • High performance Dual Fan CoolStream cooler • Requires min 700w PSU with • PCI Express 3.0 two 6pin PCIe connectors • AMD Eyefinity 6 capable (with Club 3D MST Hub) with PLP support Outputs: • AMD Graphics core Next architecture • AMD PowerTune • AMD ZeroCore Power • AMD FreeSync support • AMD Bridgeless CrossFire • Custom backplate Quick install guide: PRODUCT LINK CLICK HERE Disclaimer: While we endeavor to provide the most accurate, up-to-date information available, the content on this document may be out of date or include omissions, inaccuracies or other errors.
    [Show full text]
  • SAPPHIRE R9 285 2GB GDDR5 ITX COMPACT OC Edition (UEFI)
    Specification Display Support 4 x Maximum Display Monitor(s) support 1 x HDMI (with 3D) Output 2 x Mini-DisplayPort 1 x Dual-Link DVI-I 928 MHz Core Clock GPU 28 nm Chip 1792 x Stream Processors 2048 MB Size Video Memory 256 -bit GDDR5 5500 MHz Effective 171(L)X110(W)X35(H) mm Size. Dimension 2 x slot Driver CD Software SAPPHIRE TriXX Utility DVI to VGA Adapter Mini-DP to DP Cable Accessory HDMI 1.4a high speed 1.8 meter cable(Full Retail SKU only) 1 x 8 Pin to 6 Pin x2 Power adaptor Overview HDMI (with 3D) Support for Deep Color, 7.1 High Bitrate Audio, and 3D Stereoscopic, ensuring the highest quality Blu-ray and video experience possible from your PC. Mini-DisplayPort Enjoy the benefits of the latest generation display interface, DisplayPort. With the ultra high HD resolution, the graphics card ensures that you are able to support the latest generation of LCD monitors. Dual-Link DVI-I Equipped with the most popular Dual Link DVI (Digital Visual Interface), this card is able to display ultra high resolutions of up to 2560 x 1600 at 60Hz. Advanced GDDR5 Memory Technology GDDR5 memory provides twice the bandwidth per pin of GDDR3 memory, delivering more speed and higher bandwidth. Advanced GDDR5 Memory Technology GDDR5 memory provides twice the bandwidth per pin of GDDR3 memory, delivering more speed and higher bandwidth. AMD Stream Technology Accelerate the most demanding applications with AMD Stream technology and do more with your PC. AMD Stream Technology allows you to use the teraflops of compute power locked up in your graphics processer on tasks other than traditional graphics such as video encoding, at which the graphics processor is many, many times faster than using the CPU alone.
    [Show full text]
  • Porting Source to Linux
    Porting Source to Linux Valve’s Lessons Learned Overview . Who is this talk for? . Why port? . Windows->Linux . Linux Tools . Direct3D->OpenGL Why port? 100% Why port? Nov Dec Jan Feb . Linux is open 10% . Linux (for gaming) is growing, and quickly 1% . Stepping stone to mobile . Performance 0% . Steam for Linux Linux Mac Windows % December January February Windows 94.79 94.56 94.09 Mac 3.71 3.56 3.07 Linux 0.79 1.12 2.01 Why port? – cont’d . GL exposes functionality by hardware capability—not OS. China tends to have equivalent GPUs, but overwhelmingly still runs XP — OpenGL can allow DX10/DX11 (and beyond) features for all of those users Why port? – cont’d . Specifications are public. GL is owned by committee, membership is available to anyone with interest (and some, but not a lot, of $). GL can be extended quickly, starting with a single vendor. GL is extremely powerful Windows->Linux Windowing issues . Consider SDL! . Handles all cross-platform windowing issues, including on mobile OSes. Tight C implementation—everything you need, nothing you don’t. Used for all Valve ports, and Linux Steam http://www.libsdl.org/ Filesystem issues . Linux filesystems are case-sensitive . Windows is not . Not a big issue for deployment (because everyone ships packs of some sort) . But an issue during development, with loose files . Solution 1: Slam all assets to lower case, including directories, then tolower all file lookups (only adjust below root) . Solution 2: Build file cache, look for similarly named files Other issues . Bad Defines — E.g. Assuming that LINUX meant DEDICATED_SERVER .
    [Show full text]
  • AMD Codexl 1.8 GA Release Notes
    AMD CodeXL 1.8 GA Release Notes Contents AMD CodeXL 1.8 GA Release Notes ......................................................................................................... 1 New in this version .............................................................................................................................. 2 System Requirements .......................................................................................................................... 2 Getting the latest Catalyst release ....................................................................................................... 4 Note about installing CodeAnalyst after installing CodeXL for Windows ............................................... 4 Fixed Issues ......................................................................................................................................... 4 Known Issues ....................................................................................................................................... 5 Support ............................................................................................................................................... 6 Thank you for using CodeXL. We appreciate any feedback you have! Please use the CodeXL Forum to provide your feedback. You can also check out the Getting Started guide on the CodeXL Web Page and the latest CodeXL blog at AMD Developer Central - Blogs This version contains: For 64-bit Windows platforms o CodeXL Standalone application o CodeXL Microsoft® Visual Studio®
    [Show full text]