Real-Time Rendering
Recommended publications
Cardinality-Constrained Texture Filtering
To appear in ACM TOG 32(4).
Cardinality-Constrained Texture Filtering
Josiah Manson (Texas A&M University), Scott Schaefer (Texas A&M University)
Figure 1 panels: (a) Input, (b) Exact, (c) 8 Texels, (d) Trilinear.
Figure 1: We show a comparison of Lanczos 2 filter approximations showing (a) the 1024² input image, with downsampled images at a resolution of 89² calculated using (b) an exact Lanczos filter, (c) an eight texel approximation using our method, and (d) trilinear interpolation applied to a Lanczos filtered mipmap. Our approximation produces an image that is nearly the same as the exact filtered image while using the same number of texels as trilinear interpolation.
Abstract: We present a method to create high-quality sampling filters by combining a prescribed number of texels from several resolutions in a mipmap. Our technique provides fine control over the number of texels we read per texture sample so that we can scale quality to match a memory bandwidth budget. Our method also has a fixed cost regardless of the filter we approximate, which makes it feasible to approximate higher-order filters such as a Lanczos 2 filter in real-time rendering.
[…] theorem [Shannon 1949] implies that we must use a low-pass filter to remove high-frequency data from the image prior to sampling. There are a variety of low-pass filters, where each filter has its own set of tradeoffs. Some filters remove aliasing at the cost of overblurring the image, while others blur less but allow more aliasing. Filters that are effective at removing aliasing without overblurring sum over a greater number of texels, which makes them expensive to compute. As an extreme example, the sinc filter removes all high frequencies and no low frequencies, but sums over an infinite num-
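For orientation, the "exact" Lanczos 2 filter that the excerpt uses as its reference can be sketched in a few lines. This is only an illustration of the standard windowed-sinc kernel applied to a 1D signal, not the paper's cardinality-constrained method; the function names `lanczos` and `downsample_1d`, the texel-center convention, and the clamp-to-edge handling are assumptions made for this sketch.

```python
import math

def lanczos(x, a=2):
    """Standard Lanczos kernel: sinc(x) * sinc(x / a) for |x| < a, else 0."""
    if x == 0.0:
        return 1.0
    if abs(x) >= a:
        return 0.0
    px = math.pi * x
    return a * math.sin(px) * math.sin(px / a) / (px * px)

def downsample_1d(signal, out_len, a=2):
    """Downsample a 1D signal with a Lanczos filter widened by the scale factor.

    Assumes out_len <= len(signal) (minification) and clamps reads at the edges.
    """
    n = len(signal)
    scale = n / out_len  # input texels per output sample
    out = []
    for i in range(out_len):
        # Center of output sample i, expressed in input texel coordinates.
        center = (i + 0.5) * scale - 0.5
        lo = math.floor(center - a * scale)
        hi = math.ceil(center + a * scale)
        total = 0.0
        weight_sum = 0.0
        for j in range(lo, hi + 1):
            w = lanczos((j - center) / scale, a)
            k = min(max(j, 0), n - 1)  # clamp-to-edge addressing (assumption)
            total += w * signal[k]
            weight_sum += w
        # Normalize so that a constant signal stays constant.
        out.append(total / weight_sum)
    return out
```

Each output sample reads roughly `2 * a * scale` input texels, which is the cost that grows with the minification factor and that a fixed texel budget is meant to avoid.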
NVIDIA GPU Programming Guide
Version 2.4.0
Notice: ALL NVIDIA DESIGN SPECIFICATIONS, REFERENCE BOARDS, FILES, DRAWINGS, DIAGNOSTICS, LISTS, AND OTHER DOCUMENTS (TOGETHER AND SEPARATELY, “MATERIALS”) ARE BEING PROVIDED “AS IS.” NVIDIA MAKES NO WARRANTIES, EXPRESSED, IMPLIED, STATUTORY, OR OTHERWISE WITH RESPECT TO THE MATERIALS, AND EXPRESSLY DISCLAIMS ALL IMPLIED WARRANTIES OF NONINFRINGEMENT, MERCHANTABILITY, AND FITNESS FOR A PARTICULAR PURPOSE. Information furnished is believed to be accurate and reliable. However, NVIDIA Corporation assumes no responsibility for the consequences of use of such information or for any infringement of patents or other rights of third parties that may result from its use. No license is granted by implication or otherwise under any patent or patent rights of NVIDIA Corporation. Specifications mentioned in this publication are subject to change without notice. This publication supersedes and replaces all information previously supplied. NVIDIA Corporation products are not authorized for use as critical components in life support devices or systems without express written approval of NVIDIA Corporation.
Trademarks: NVIDIA, the NVIDIA logo, GeForce, and NVIDIA Quadro are registered trademarks of NVIDIA Corporation. Other company and product names may be trademarks of the respective companies with which they are associated.
Copyright © 2005 by NVIDIA Corporation. All rights reserved.
HISTORY OF MAJOR REVISIONS (Version, Date, Changes)
2.4.0, 07/08/2005: Updated cover; added GeForce 7 Series content
2.3.0, 02/08/2005: Added 2D & Video Programming chapter; added more SLI information
2.2.1, 11/23/2004: Minor formatting improvements
2.2.0, 11/16/2004: Added normal map format advice; added ps_3_0 performance advice; added General Advice chapter
2.1.0, 07/20/2004: Added Stereoscopic Development chapter
2.0.4, 07/15/2004: Updated MRT section
2.0.3, 06/25/2004: Added Multi-GPU Support chapter
NVIDIA GPU Programming Guide, Table of Contents: Chapter 1.
Deferred Rendering Using Compute Shaders
Deferred rendering using Compute shaders: A comparative study using shader model 4.0 and 5.0
Benjamin Golba
This thesis is submitted to the Department of Interaction and System Design at Blekinge Institute of Technology in partial fulfillment of the requirements for the Bachelor degree in Computer Science. The thesis is equivalent to 10 weeks of full time studies.
Contact Information: Author: Benjamin Golba. Address: Folkparksvägen 10:17, 372 40 Ronneby. E-mail: [email protected]
University advisor: Stefan Petersson, Department of Software Engineering and Computer Science. Address: Soft Center, RONNEBY. Phone: +46 457 38 58 15
Department of Interaction and System Design, Blekinge Institute of Technology, SE - 372 25 RONNEBY, Sweden. Internet: http://www.bth.se/tek/ais Phone: +46 457 38 58 00 Fax: +46 457 271 25
Abstract: Game developers today are putting a lot of effort into their games. Consumers are hard to please and demand a game which can provide both fun and visual quality. This is why developers aim to make the most use of what hardware resources are available to them to achieve the best possible quality of the game. It is easy to use too many performance demanding techniques in a game, making the game unplayable. The hard part is to make the game look good without decreasing the performance. This can be done by using techniques in a smart way to make the graphics as smooth and efficient as they can be without compromising the visual quality. One of these techniques is deferred rendering.
Computer Graphics Texture Filtering & Sampling Theory
Computer Graphics: Texture Filtering
Philipp Slusallek

Reconstruction Filter
• Simple texture mapping in a ray-tracer
  – Ray hits surface, e.g. a triangle
  – Each triangle vertex also has an arbitrary texture coordinate
    • Map this vertex into 2D texture space (aka. texture parameterization)
  – Use barycentric coordinates to map hit point into texture space
    • Hit point generally does not exactly hit a texture sample
    • Use reconstruction filter to find color for hit point

Nearest Neighbor “Interpolation”
[Figure: texture space with axes u and v; texels c0, c1 in the lower row and c2, c3 in the upper row]
• How to compute the color of the pixel?
  – Choose the closest texture sample
    • Rounding of the texture coordinate in texture space
    • c = tex[ min( u * resU , resU - 1 ) , min( v * resV , resV - 1 ) ];

Bilinear Interpolation
[Figure: texture space with axes u and v; texels c0, c1, c2, c3 and the fractional offsets s, 1-s (horizontal) and t, 1-t (vertical)]
• How to compute the color of the pixel?
  – Interpolate between surrounding four pixels
  – c = (1-t) (1-s) c0 + (1-t) s c1 + t (1-s) c2 + t s c3
• Can be done in two steps:
  – c = (1-t) ( (1-s) c0 + s c1 ) + t ( (1-s) c2 + s c3 )
  – Horizontally: twice between left and right samples using fractional part of the texture coordinate (1-s, s):
    • i0 = (1-s) c0 + s c1
    • i1 = (1-s) c2 + s c3
  – Vertically: between two intermediate results (1-t, t):
    • c = (1-t) i0 + t i1

Filtering
• Magnification (Zoom-in)
  – Map few texels onto many pixels
  – Reconstruction filter:
    • Nearest neighbor interpolation:
      – Take the nearest texel
    • Bilinear interpolation:
      – Interpolation between
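The two reconstruction filters from the slides can be written out as a short sketch. It follows the slide formulas (nearest texel via `min(u * res, res - 1)`, and bilinear interpolation in two steps through `i0` and `i1`), but several details are assumptions not stated in the excerpt: the texture is a list of rows indexed as `tex[y][x]`, texel centers sit at half-integer positions, and reads outside the texture are clamped to the edge. The function names are hypothetical.

```python
import math

def sample_nearest(tex, u, v):
    """Nearest-neighbor lookup for u, v in [0, 1]; tex is a list of rows."""
    res_v = len(tex)
    res_u = len(tex[0])
    x = min(int(u * res_u), res_u - 1)
    y = min(int(v * res_v), res_v - 1)
    return tex[y][x]

def sample_bilinear(tex, u, v):
    """Bilinear lookup: two horizontal interpolations, then one vertical."""
    res_v = len(tex)
    res_u = len(tex[0])
    # Continuous texel coordinates, assuming texel centers at (i + 0.5) / res.
    x = u * res_u - 0.5
    y = v * res_v - 0.5
    x0 = math.floor(x)
    y0 = math.floor(y)
    s = x - x0  # horizontal fractional part
    t = y - y0  # vertical fractional part

    def clamp(i, n):
        return min(max(i, 0), n - 1)  # clamp-to-edge addressing (assumption)

    xa, xb = clamp(x0, res_u), clamp(x0 + 1, res_u)
    ya, yb = clamp(y0, res_v), clamp(y0 + 1, res_v)
    c0, c1 = tex[ya][xa], tex[ya][xb]
    c2, c3 = tex[yb][xa], tex[yb][xb]
    i0 = (1 - s) * c0 + s * c1  # lower row
    i1 = (1 - s) * c2 + s * c3  # upper row
    return (1 - t) * i0 + t * i1
```

For a 2×2 texture `[[0.0, 1.0], [2.0, 3.0]]`, sampling at the exact middle `(0.5, 0.5)` weights all four texels equally, while sampling at a texel center such as `(0.25, 0.25)` returns that texel unchanged.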
Texture Mapping: the Basics
CHAPTER 8: Texture Mapping: The Basics
by Richard S. Wright Jr.

WHAT YOU’LL LEARN IN THIS CHAPTER (How To: Functions You’ll Use)
• Load texture images: glTexImage/glTexSubImage
• Map textures to geometry: glTexCoord
• Change the texture environment: glTexEnv
• Set texture mapping parameters: glTexParameter
• Generate mipmaps: gluBuildMipmaps
• Manage multiple textures: glBindTexture

In the preceding chapter, we covered in detail the groundwork for loading image data into OpenGL. Image data, unless modified by pixel zoom, generally has a one-to-one correspondence between a pixel in an image and a pixel on the screen. In fact, this is where we get the term pixel (picture element). In this chapter, we extend this knowledge further by applying images to three-dimensional primitives. When we apply image data to a geometric primitive, we call this a texture or texture map. Figure 8.1 shows the dramatic difference that can be achieved by texture mapping geometry. The cube on the left is a lit and shaded featureless surface, whereas the cube on the right shows a richness in detail that can be reasonably achieved only with texture mapping.

FIGURE 8.1 The stark contrast between textured and untextured geometry.

A texture image when loaded has the same makeup and arrangement as pixmaps, but now a one-to-one correspondence seldom exists between texels (the individual picture elements in a texture) and pixels on the screen. This chapter covers the basics of loading a texture map into memory and all the ways in which it may be mapped to and applied to geometric primitives.
Spatio-Temporal Upsampling on the GPU
Spatio-Temporal Upsampling on the GPU
Robert Herzog (MPI Informatik), Elmar Eisemann (Saarland University / MPI / Télécom ParisTech), Karol Myszkowski (MPI Informatik), H.-P. Seidel (MPI Informatik)
Figure 1 panels: Reference (2 fps), Temporally-amortized Upsampling (15 fps, PSNR 65.6), Spatial Upsampling (24 fps, PSNR 66.5), Our Spatio-temporal Upsampling (22 fps, PSNR 67.4).
Figure 1: Comparison of different upsampling schemes in a fully dynamic scene with complex shading (indirect light and ambient occlusion).
Abstract: Pixel processing is becoming increasingly expensive for real-time applications due to the complexity of today’s shaders and high-resolution framebuffers. However, most shading results are spatially or temporally coherent, which allows for sparse sampling and reuse of neighboring pixel values. This paper proposes a simple framework for spatio-temporal upsampling on modern GPUs. In contrast to previous work, which focuses either on temporal or spatial processing on the GPU, we exploit coherence in both. Our algorithm combines adaptive motion-compensated filtering over time and geometry-aware upsampling in image space.
[…] reduce rendering costs, suppress aliasing, and popping artifacts becomes more and more attractive. Our method is driven by the observation that high quality is most important for static elements, thus we can accept some loss if strong differences occur. This has been shown to be a good assumption, recently exploited for shadow computations [Scherzer et al. 2007]. To achieve our goal, we rely on a varying sampling pattern producing a low-resolution image and keep several such samples over time. Our idea is to integrate all these samples in a unified manner. The heart of our method is a filtering strategy that combines sam-
Antialiasing Complex Global Illumination Effects in Path-Space
Antialiasing Complex Global Illumination Effects in Path-space
Laurent Belcour¹, Ling-Qi Yan², Ravi Ramamoorthi³, and Derek Nowrouzezahrai¹
¹Université de Montréal, ²UC Berkeley, ³UC San Diego
Abstract: We present the first method to efficiently and accurately predict antialiasing footprints to pre-filter color-, normal-, and displacement-mapped appearance in the context of multi-bounce global illumination. We derive Fourier spectra for radiance and importance functions that allow us to compute spatial-angular filtering footprints at path vertices, for both uni- and bi-directional path construction. We then use these footprints to antialias reflectance modulated by high-resolution color, normal, and displacement maps encountered along a path. In doing so, we also unify the traditional path-space formulation of light-transport with our frequency-space interpretation of global illumination pre-filtering. Our method is fully compatible with all existing single bounce pre-filtering appearance models, not restricted by path length, and easy to implement atop existing path-space renderers. We illustrate its effectiveness on several radiometrically complex scenarios where previous approaches either completely fail or require or-
[…]imate surface footprints from this bandlimit. We also merge two independent unidirectional frequency analyses at path vertex connections, where pixel and light footprints are propagated independently across multiple scene interactions, in order to devise the first accurate bidirectional antialiasing approach. We apply our method to complex GI effects from surfaces with high-resolution normal, color, and displacement maps (Figures 6 to 11). Our implementation is straightforward to integrate into modern renderers and we compare our filtered transport algorithms to path-sampling approaches with ray differentials [Igehy 1999] (when available), additionally employing different local appearance prefiltering methods (i.e., Heckbert’s diffuse color filtering [1986] and Yan et al.’s specular normal map filtering [2014]).
Design and Development of Stream Processor and Texture Filtering Unit for Graphics Processor Architecture IP
Design and Development of Stream Processor and Texture Filtering Unit for Graphics Processor Architecture IP Krishna Bhushan Vutukuru1, Sanket Dessai2 1- M.Sc. [Engg.] Student, 2- Assistant Professor Computer Engineering Dept., M. S. Ramaiah School of Advanced Studies, Bangalore 560 058. Abstract Graphical Processing Units (GPUs) have become an integral part of today’s mainstream computing systems. They are also being used as reprogrammable General Purpose GPUs (GP-GPUs) to perform complex scientific computations. Reconfigurability is an attractive approach to embedded systems allowing hardware level modification. Hence, there is a high demand for GPU designs based on reconfigurable hardware. This paper presents the architectural design, modelling and simulation of reconfigurable stream processor and texture filtering unit of a GPU. Stream processor consists of clusters of functional units which provide a bandwidth hierarchy, supporting hundreds of arithmetic units. The arithmetic cluster units are designed to exploit instruction level parallelism and subword parallelism within a cluster and data parallelism across the clusters. The texture filter unit is designed to process geometric data like vertices and convert these into pixels on the screen. This process involves number of operations, like circle and cube generation, rotator, and scaling. The texture filter unit is designed with all necessary hardware to deal with all the different filtering operations. For decreasing the area and power, a single controller is used to control data flow between clusters and between host processor and GPU. The designed architecture provides a high degree of scalability and flexibility to allow customization for unique applications. The designed stream processor and texture filtering unit are modelled in Verilog on Altera Quartus II and simulated using ModelSim tools. -
The OpenGL Rendering Pipeline
The OpenGL Rendering Pipeline
CSE 781, Winter 2010, Han-Wei Shen

Brief History of OpenGL
• Originated from a proprietary API called Iris GL from Silicon Graphics, Inc.
• Provides access to graphics hardware capabilities at the lowest possible level that still provides hardware independence
• The evolution is controlled by the OpenGL Architecture Review Board, or ARB
• OpenGL 1.0 API finalized in 1992, first implementation in 1993
• In 2006, OpenGL ARB became a workgroup of the Khronos Group
• 10 revisions since 1992

OpenGL Evolution
• 1.1 (1997): vertex arrays and texture objects
• 1.2 (1998): 3D textures
• 1.3 (2001): cubemap textures, compressed textures, multitextures
• 1.4 (2002): mipmap generation, shadow map textures, etc
• 1.5 (2003): vertex buffer object, shadow comparison functions, occlusion queries, non-power-of-2 textures
• 2.0 (2004): vertex and fragment shading (GLSL 1.1), multiple render targets, etc
• 2.1 (2006): GLSL 1.2, pixel buffer objects, etc
• 3.0 (2008): GLSL 1.3, deprecation model, etc
• 3.1 (2009): GLSL 1.4, texture buffer objects, move much of deprecated functions to ARB compatible extension
• 3.2 (2009)

OpenGL Extensions
• New features/functions are marked with prefix
  – Supported only by one vendor: NV_float_buffer (by nvidia)
  – Supported by multiple vendors: EXT_framebuffer_object
  – Reviewed by ARB: ARB_depth_texture
  – Promoted to standard OpenGL API

Deprecation Model, Contexts, and Profiles
• Redundant and inefficient functions are deprecated, to be removed in the future
  – glBegin(), glEnd()
• OpenGL Contexts – data
Deferred Shading Tutorial
Deferred Shading Tutorial
Fabio Policarpo¹ ([email protected]), Francisco Fonseca² ([email protected])
¹,²CheckMate Games; ²Pontifical Catholic University of Rio de Janeiro, ICAD/Igames/VisionLab

1. Introduction
Techniques usually considered non-interactive a few years ago are now possible in real-time using the flexibility and speed of new programmable graphics hardware. An example of that is the deferred shading technique, which is an approach that postpones shading calculations for a fragment until the visibility of that fragment is completely determined. In other words, it implies that only fragments that really contribute to the resultant image are shaded. Although deferred shading has become practical for real-time applications in recent years, this technique was first published in 1988 by Michael Deering et al. [Deering88]. In that work, the authors proposed a VLSI system where a pipeline of triangle processors rasterizes the geometry, and then a pipeline of shading processors applies Phong shading [Phong75] with multiple light sources to such geometry. After the initial research performed by Deering et al., the next relevant work involving deferred shading was developed by Saito and Takahashi [Saito90] in 1990. The authors of this article proposed a rendering technique that produces 3D images that favor the recognition of shapes and patterns, since shapes can be readily understood if certain geometric properties are enhanced. In order to optimize the enhancement process, geometric properties of the surfaces are preserved as Geometric-Buffers (G-buffers). So, by using G-buffers as intermediate results, artificial enhancement processes are separated from geometric processes (projection and hidden surface removal) and physical processes (shading and texture mapping), and performed as a post-processing pass.
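The idea described above, resolving visibility first and shading only the surviving fragments, can be illustrated with a small CPU-side sketch. It is not the tutorial's implementation: the `Fragment` record, the function names, and the use of simple Lambert diffuse lighting with one directional light (instead of the Phong shading mentioned in the text) are all assumptions chosen to keep the example short.

```python
from dataclasses import dataclass

@dataclass
class Fragment:
    x: int
    y: int
    depth: float    # smaller means closer to the camera
    normal: tuple   # unit surface normal
    albedo: tuple   # RGB surface color

def geometry_pass(fragments, width, height):
    """Fill a G-buffer with the nearest fragment per pixel; no shading happens here."""
    gbuffer = [[None] * width for _ in range(height)]
    for frag in fragments:
        current = gbuffer[frag.y][frag.x]
        if current is None or frag.depth < current.depth:
            gbuffer[frag.y][frag.x] = frag
    return gbuffer

def shading_pass(gbuffer, light_dir, background=(0.0, 0.0, 0.0)):
    """Shade each covered pixel exactly once from the stored attributes."""
    image = []
    shaded = 0
    for row in gbuffer:
        out_row = []
        for frag in row:
            if frag is None:
                out_row.append(background)
                continue
            # Lambert diffuse term (an assumption of this sketch, not Phong).
            n_dot_l = max(0.0, sum(n * l for n, l in zip(frag.normal, light_dir)))
            out_row.append(tuple(channel * n_dot_l for channel in frag.albedo))
            shaded += 1
        image.append(out_row)
    return image, shaded
```

If two fragments land on the same pixel, only the nearer one is stored and later shaded, so the shading cost depends on the number of covered pixels rather than on the number of rasterized fragments.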
Anti-Aliasing
Antialiasing & Texturing
Steve Rotenberg, CSE168: Rendering Algorithms, UCSD, Spring 2017

Texture Minification
• Consider a texture mapped triangle
• Assume that we point sample our texture so that we use the nearest texel to the center of the pixel to get our color
• If we are far enough away from the triangle so that individual texels in the texture end up being smaller than a single pixel in the framebuffer, we run into a potential problem
• If the object (or camera) moves a tiny amount, we may see drastic changes in the pixel color, as different texels will rapidly pass in front of the pixel center
• This causes a flickering problem known as shimmering or buzzing
• Texture buzzing is an example of aliasing

Small Triangles
• A similar problem happens with very small triangles
• If we shoot a single ray right through the center of a pixel, then we are essentially point sampling the image
• This has the potential to miss small triangles
• If we have small, moving triangles, they may cause pixels to flicker on and off as they cross the pixel centers
• A related problem can be seen when very thin triangles cause pixel gaps
• These are more examples of aliasing problems

Stairstepping
• What about the jagged right angle patterns we see at the edges of triangles?
• This is known as the stairstepping problem, also affectionately known as “the jaggies”
• These can be visually distracting, especially for high contrast edges near horizontal or vertical
• Stairstepping is another form of aliasing

Moiré Patterns
• When we try to render high detail
Efficient Partitioning of Fragment Shaders for Multiple-Output Hardware
Graphics Hardware (2004), T. Akenine-Möller, M. McCool (Editors)
Efficient Partitioning of Fragment Shaders for Multiple-Output Hardware
Tim Foley, Mike Houston and Pat Hanrahan, Stanford University
Abstract: Partitioning fragment shaders into multiple rendering passes is an effective technique for virtualizing shading resource limits in graphics hardware. The Recursive Dominator Split (RDS) algorithm is a polynomial-time algorithm for partitioning fragment shaders for real-time rendering that has been shown to generate efficient partitions. RDS does not, however, work for shaders with multiple outputs, and does not optimize for hardware with support for multiple render targets. We present Merging Recursive Dominator Split (MRDS), an extension of the RDS algorithm to shaders with arbitrary numbers of outputs which can efficiently utilize hardware support for multiple render targets, as well as a new cost metric for evaluating the quality of multipass partitions on modern consumer graphics hardware. We demonstrate that partitions generated by our algorithm execute more efficiently than those generated by RDS alone, and that our cost model is effective in predicting the relative performance of multipass partitions.
Categories and Subject Descriptors (according to ACM CCS): I.3.1 [Computer Graphics]: Graphics processors; G.2.2 [Mathematics of Computing]: Graph Algorithms, Trees
1. Introduction
Real-time shading languages for graphics hardware simplify the task of writing shader code that is portable across a range of hardware and graphics APIs. However, most current high-level shading language compilers do not virtualize platform-specific resource limits such as number of instructions, input
[…] shaders [CNS∗02]. However, RDS is limited to operating on shaders with a single output color. Recently, it has been shown that graphics hardware can also be used to run a large number of non-shading algorithms including ray tracing [PBMH02], fluid dynamics [HBSL03], and stream processing based applications [BFH∗04].