Evolution of GPUs
Chris Seitz
Overview
Concepts:
  Real-time rendering
  Hardware graphics pipeline
Evolution of the PC hardware graphics pipeline:
  1995-1998: Texture mapping and z-buffer
  1998: Multitexturing
  1999-2000: Transform and lighting
  2001: Programmable vertex shader
  2002-2003: Programmable pixel shader
  2004: Shader model 3.0 and 64-bit color support
©2004 NVIDIA Corporation. All rights reserved.
Real-Time Rendering
Graphics hardware enables real-time rendering.
Real-time means a display rate of more than 10 images per second.
3D Scene = Collection of 3D primitives (triangles, lines, points)
Image = Array of pixels
PC Architecture
Motherboard
CPU
System Memory
Bus Port (PCI, AGP, PCIe)
Graphics Board
Video Memory
GPU
Hardware Graphics Pipeline
Application Stage -> (3D Triangles) -> Geometry Stage -> (2D Triangles) -> Rasterization Stage -> (Pixels)
Geometry Stage, for each triangle vertex:
  Transform position: 3D -> screen
  Compute attributes
Rasterization Stage, for each triangle:
  Rasterize triangle
  Interpolate vertex attributes
  Shade pixels
  Resolve visibility
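The rasterization stage listed above can be sketched on the CPU in a few lines of Python. This is only an illustration under stated assumptions, not how any particular GPU implements it: the names `edge` and `rasterize` are hypothetical, each vertex is assumed to be already in screen space as `(x, y, z, attribute)`, and smaller z is assumed to mean closer to the viewer.

```python
def edge(a, b, p):
    """Signed area term: twice the signed area of triangle (a, b, p)."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])


def rasterize(tri, width, height, color_buf, z_buf):
    """Rasterize one screen-space triangle into color_buf and z_buf.

    tri holds three vertices (x, y, z, shade); shade stands in for any
    per-vertex attribute (assumption made for this sketch).
    """
    v0, v1, v2 = tri
    area = edge(v0, v1, v2)
    if area == 0:
        return  # degenerate triangle covers no pixels
    for y in range(height):
        for x in range(width):
            p = (x + 0.5, y + 0.5)  # sample at the pixel center
            # Barycentric weights of p with respect to v0, v1, v2.
            w0 = edge(v1, v2, p) / area
            w1 = edge(v2, v0, p) / area
            w2 = edge(v0, v1, p) / area
            if w0 < 0 or w1 < 0 or w2 < 0:
                continue  # pixel center lies outside the triangle
            # Interpolate vertex attributes.
            z = w0 * v0[2] + w1 * v1[2] + w2 * v2[2]
            shade = w0 * v0[3] + w1 * v1[3] + w2 * v2[3]
            # Resolve visibility: keep the fragment nearest the viewer.
            if z < z_buf[y][x]:
                z_buf[y][x] = z
                color_buf[y][x] = shade
```

Dividing by the signed area makes the inside test work for either vertex winding. Real hardware walks only the pixels near the triangle instead of the whole screen.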
Real-time Graphics 1997: RIVA 128 – 3M Transistors
Real-time Graphics 2004:
Unreal Engine 3.0 images courtesy of Epic Games
'95-'98: Texture Mapping & Z-Buffer
CPU: Application / Geometry Stage
GPU: Rasterization Stage (Rasterizer, Texture Unit, Raster Operations Unit)
Bus (PCI): 2D Triangles, Textures
System Memory: 2D Triangles, Textures
Video Memory: Textures, Frame Buffer
Raster Operations Unit (ROP)
Rasterizer -> Texture Unit -> (Fragments) -> Raster Operations Unit -> (Pixels)
Fragment inputs: screen position (x, y), alpha value, depth z, color (r, g, b)
Frame buffer: stencil buffer, z-buffer, color buffer
Scissor Test: fragment screen position (x, y) tested against scissor rectangle
Alpha Test: fragment alpha value tested against reference value
Stencil Test: stencil buffer value at (x, y) tested against reference value
Z Test (visibility test): fragment depth z tested against z-buffer value at (x, y)
Alpha Blending: fragment color (r, g, b) blended with color buffer value at (x, y)
  Ksrc * Colorsrc + Kdst * Colordst (src = fragment, dst = color buffer)
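The sequence of raster operations can be sketched in Python as follows. The function name `raster_op`, the dictionary layout of the fragment and frame buffer, and the specific comparison functions (alpha greater-than, stencil equal, depth less-than) are assumptions made for this illustration; real APIs let the application choose each comparison.

```python
def raster_op(frag, fb, scissor, alpha_ref, stencil_ref, k_src, k_dst):
    """Run one fragment through the tests; return True if it was written.

    frag: dict with keys x, y, z, alpha, color (r, g, b).
    fb:   dict with 2D lists under keys stencil, z, color.
    """
    x, y = frag["x"], frag["y"]
    x0, y0, x1, y1 = scissor
    if not (x0 <= x < x1 and y0 <= y < y1):
        return False  # scissor test failed
    if frag["alpha"] <= alpha_ref:
        return False  # alpha test failed (assumed function: greater-than)
    if fb["stencil"][y][x] != stencil_ref:
        return False  # stencil test failed (assumed function: equal)
    if frag["z"] >= fb["z"][y][x]:
        return False  # z test failed (assumed function: less-than)
    fb["z"][y][x] = frag["z"]
    # Alpha blending: Ksrc * Colorsrc + Kdst * Colordst.
    dst = fb["color"][y][x]
    fb["color"][y][x] = tuple(
        k_src * s + k_dst * d for s, d in zip(frag["color"], dst)
    )
    return True
```

With `k_src = alpha` and `k_dst = 1 - alpha` this reduces to conventional transparency blending.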
Texture Mapping
Triangle Mesh (with UV coordinates) + Base Texture = Textured Mesh
Sampling: Magnification, Minification
Filtering: Bilinear, Trilinear, Anisotropic
Mipmapping
Perspective Correct Interpolation
Texture Mapping: Mipmapping & Filtering
Bilinear Filtering or Trilinear Filtering
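As a rough illustration of the two filters named above, the following Python sketch samples a grayscale texture stored as a list of rows. The helper names (`texel`, `sample_bilinear`, `sample_trilinear`), the clamp-to-edge addressing, and the half-texel center convention are assumptions of this sketch rather than a description of any specific hardware.

```python
import math


def texel(tex, x, y):
    """Fetch a texel with clamp-to-edge addressing."""
    h, w = len(tex), len(tex[0])
    return tex[min(max(y, 0), h - 1)][min(max(x, 0), w - 1)]


def sample_bilinear(tex, u, v):
    """Sample a grayscale texture at normalized coordinates (u, v) in [0, 1]."""
    h, w = len(tex), len(tex[0])
    # Texel centers sit at half-integer positions.
    x = u * w - 0.5
    y = v * h - 0.5
    x0, y0 = math.floor(x), math.floor(y)
    fx, fy = x - x0, y - y0
    # Blend the four nearest texels: first along x, then along y.
    top = (1 - fx) * texel(tex, x0, y0) + fx * texel(tex, x0 + 1, y0)
    bottom = (1 - fx) * texel(tex, x0, y0 + 1) + fx * texel(tex, x0 + 1, y0 + 1)
    return (1 - fy) * top + fy * bottom


def sample_trilinear(mips, u, v, lod):
    """Blend bilinear samples from the two mip levels surrounding lod."""
    lod = min(max(lod, 0.0), len(mips) - 1)
    lo = int(math.floor(lod))
    hi = min(lo + 1, len(mips) - 1)
    t = lod - lo
    return (1 - t) * sample_bilinear(mips[lo], u, v) + t * sample_bilinear(mips[hi], u, v)
```

Here `mips[0]` is the full-resolution texture and each later entry is a smaller prefiltered copy. Trilinear filtering costs two bilinear lookups but hides the visible seam where the mip level changes.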
Filtering Examples
Point, Bilinear, Trilinear, Anisotropic (4X) & Trilinear
Incoming
Riva TNT – 7M Transistors – 1998
Multitexture
1998: Multitexturing
CPU: Application / Geometry Stage
GPU: Rasterization Stage (Rasterizer, Multi Texture Unit (diagram label: 2), Raster Operations Unit)
Bus (AGP): 2D Triangles, Textures
System Memory: 2D Triangles, Textures
Video Memory: Textures, Frame Buffer
Multitexturing
Base Texture modulated by Light Map:
Base Texture x Light Map = Result
Images from UT2004 (c) Epic Games Inc. Used with permission.
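Modulation is a texel-by-texel multiply of the two textures. A minimal Python sketch, assuming both images have the same size and store RGB tuples in the range 0 to 1 (the function name `modulate` is hypothetical):

```python
def modulate(base, light_map):
    """Multiply two equally sized RGB images texel by texel."""
    return [
        [tuple(b * l for b, l in zip(base_px, light_px))
         for base_px, light_px in zip(base_row, light_row)]
        for base_row, light_row in zip(base, light_map)
    ]
```

In practice the light map is usually much lower in resolution than the base texture and is filtered up by the texture unit before the multiply.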
GeForce 256 – 23M Transistors – 1999
Hardware Transform & Lighting
1999-2000: Transform and Lighting
CPU: Application Stage
GPU ("Fixed Function Pipeline"):
  Geometry Stage: Transform & Lighting Unit
  Rasterization Stage: Rasterizer, Texture Unit, Register Combiner, Raster Operations Unit
Bus (AGP): 3D Triangles, Textures
System Memory: 3D Triangles, Textures
Video Memory: Textures, Frame Buffer
Transform and Lighting Unit (TnL)
Transform:
  Model or Object Space
    -> World Matrix ->
  World Space
    -> View Matrix ->
  Camera or Eye Space
    -> Projection Matrix ->
  Projection or Clip Space
    -> Perspective Division & Viewport Matrix ->
  Screen or Window Space
  (Model-View Matrix = World Matrix followed by View Matrix)
Lighting:
  Material Properties, Light Properties, Vertex Color -> Vertex Diffuse & Specular Color
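The chain of spaces above can be followed step by step in a short Python sketch. The function names, the row-major matrix layout, and the viewport convention (origin at the top-left, y pointing down) are assumptions made for illustration; the lighting function covers only a single diffuse term.

```python
def mat_vec(m, v):
    """Multiply a 4x4 row-major matrix by a 4-component column vector."""
    return [sum(m[r][c] * v[c] for c in range(4)) for r in range(4)]


def transform_vertex(position, world, view, projection, width, height):
    """Take a model-space position to screen space; returns (sx, sy, depth)."""
    p = [position[0], position[1], position[2], 1.0]  # model space
    p = mat_vec(world, p)                 # -> world space
    p = mat_vec(view, p)                  # -> camera (eye) space
    x, y, z, w = mat_vec(projection, p)   # -> clip space
    ndc_x, ndc_y, ndc_z = x / w, y / w, z / w  # perspective division
    # Viewport mapping (assumed: origin at top-left, y pointing down).
    return ((ndc_x * 0.5 + 0.5) * width, (0.5 - ndc_y * 0.5) * height, ndc_z)


def diffuse_color(normal, light_dir, material, light):
    """Per-vertex diffuse term: material * light * max(0, N.L)."""
    n_dot_l = max(0.0, sum(n * l for n, l in zip(normal, light_dir)))
    return tuple(m * c * n_dot_l for m, c in zip(material, light))
```

Applications typically premultiply the world and view matrices into one model-view matrix, so the per-vertex work is one fewer matrix multiply.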
Wanda and Bubble
GeForce 2
Pixel Shading Multitexture
GeForce 3
Programmable Vertex Shading
2001: Programmable Vertex Shader
CPU: Application Stage
GPU:
  Geometry Stage: Vertex Shader (no flow control)
  Rasterization Stage: Rasterizer w/ Z-Cull, Texture Unit (4 Textures), Register Combiner, Raster Operations Unit
Bus (AGP): 3D Triangles, Textures
System Memory: 3D Triangles, Textures
Video Memory: Textures, Frame Buffer
Vertex Shader
A programmable processor for any per-vertex computation

void VertexShader(
    // Input per vertex
    in float4 positionInModelSpace,
    in float2 textureCoordinates,
    in float3 normal,

    // Input per batch of triangles
    uniform float4x4 modelToProjection,
    uniform float3 lightDirection,

    // Output per vertex
    out float4 positionInProjectionSpace,
    out float2 textureCoordinatesOutput,
    out float3 color
)
{
    // Vertex transformation
    positionInProjectionSpace = mul(modelToProjection, positionInModelSpace);

    // Texture coordinates copy
    textureCoordinatesOutput = textureCoordinates;

    // Vertex color computation
    color = dot(lightDirection, normal);
}
Bump Mapping
Bump mapping fetches the per-pixel normal from a normal map texture (instead of using the interpolated vertex normal) to compute lighting at a given pixel.
Diffuse light + Normal Map = Diffuse light with bumps
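A small Python sketch of the per-pixel lookup described above. It assumes the common encoding in which each RGB channel in the range 0 to 1 maps to a normal component in the range -1 to 1, and that the light direction is already expressed in the same space as the stored normals; both are assumptions of this sketch, and the function names are hypothetical.

```python
import math


def decode_normal(rgb):
    """Map an RGB texel in [0, 1] to a unit normal with components in [-1, 1]."""
    n = [2.0 * c - 1.0 for c in rgb]
    length = math.sqrt(sum(c * c for c in n))
    if length == 0.0:
        return [0.0, 0.0, 1.0]  # fall back to an unperturbed normal
    return [c / length for c in n]


def diffuse_with_bumps(normal_map, light_dir):
    """Per-pixel N.L using normals fetched from the normal map."""
    return [
        [max(0.0, sum(n * l for n, l in zip(decode_normal(texel), light_dir)))
         for texel in row]
        for row in normal_map
    ]
```

The result is a grayscale lighting image that would then be multiplied with the base color, in the same way as a light map.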
Cubic Texture Mapping
Cubemap: lookup in direction (x, y, z)
Environment Mapping: the reflection vector (x, y, z) is used to look up into the cubemap
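The first two steps of an environment-map lookup, computing the reflection vector and choosing a cube face, can be sketched in Python. The function names are hypothetical, and the sketch stops at face selection because the mapping from direction to in-face coordinates follows conventions that differ between APIs.

```python
def reflect(incident, normal):
    """Reflect the incident vector about a unit normal: R = I - 2 (N.I) N."""
    d = sum(i * n for i, n in zip(incident, normal))
    return tuple(i - 2.0 * d * n for i, n in zip(incident, normal))


def cubemap_face(direction):
    """Pick the cube face from the component with the largest magnitude."""
    x, y, z = direction
    ax, ay, az = abs(x), abs(y), abs(z)
    if ax >= ay and ax >= az:
        return "+x" if x > 0 else "-x"
    if ay >= az:
        return "+y" if y > 0 else "-y"
    return "+z" if z > 0 else "-z"
```

The direction does not need to be normalized for the lookup, since only the ratios between its components matter.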
Projective Texture Mapping
Texture Projection: (x, y, z, w) -> (x/w, y/w)
Projective texture lookup: the projected texture is sampled at (x/w, y/w)
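The divide by w can be sketched in Python as below. The function names, the nearest-texel sampling, and the choice to return 0.0 outside the projector's frustum are assumptions of this sketch.

```python
def project_texcoord(coord):
    """Divide a homogeneous texture coordinate (x, y, z, w) down to (x/w, y/w)."""
    x, y, _z, w = coord
    if w <= 0:
        return None  # point is behind the projector
    return (x / w, y / w)


def tex2d_proj(tex, coord):
    """Nearest-texel projective lookup into a grayscale texture."""
    uv = project_texcoord(coord)
    if uv is None:
        return 0.0
    u, v = uv
    if not (0.0 <= u <= 1.0 and 0.0 <= v <= 1.0):
        return 0.0  # outside the projected image
    h, w = len(tex), len(tex[0])
    return tex[min(int(v * h), h - 1)][min(int(u * w), w - 1)]
```

Rejecting non-positive w avoids the mirrored "back projection" that a plain divide would produce behind the projector.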
Volume Texture Mapping
Volume texture lookup with position (x, y, z)
Solid Textures
3D Noise
Noise Perturbation
Hardware Shadow Mapping
Shadow Map Computation
  The shadow map contains the depth (z/w) of the 3D points visible from the light's point of view.
  A point (x, y, z, w) seen from the spot light is stored at shadow map position (x/w, y/w).
Shadow Rendering
  A 3D point (x, y, z, w) is in shadow if its depth z/w is greater than the value of the shadow map at (x/w, y/w), meaning something closer to the light occludes it.
  A hardware shadow map lookup returns the value of this comparison, between 0 and 1.
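A Python sketch of the depth comparison, assuming that smaller z/w means closer to the light, so a point is occluded when its depth exceeds the stored depth. The function names, the small depth bias, the convention that 1.0 means lit, and the 2x2 averaging used for the filtered case are assumptions of this sketch; they illustrate why a lookup can return values between 0 and 1.

```python
def depth_passes(shadow_map, col, row, depth, bias):
    """Return 1.0 if the point is lit at this texel, 0.0 if it is occluded."""
    h, w = len(shadow_map), len(shadow_map[0])
    col = min(max(col, 0), w - 1)
    row = min(max(row, 0), h - 1)
    return 1.0 if depth <= shadow_map[row][col] + bias else 0.0


def shadow_lookup(shadow_map, coord, bias=0.001, filtered=False):
    """Compare a light-space point (x, y, z, w) against the shadow map."""
    x, y, z, w = coord
    u, v, depth = x / w, y / w, z / w
    col = int(u * len(shadow_map[0]))
    row = int(v * len(shadow_map))
    if not filtered:
        return depth_passes(shadow_map, col, row, depth, bias)
    # Average four neighboring comparisons to soften the shadow edge.
    taps = [depth_passes(shadow_map, col + i, row + j, depth, bias)
            for j in (0, 1) for i in (0, 1)]
    return sum(taps) / len(taps)
```

The bias keeps a surface from shadowing itself when its own depth and the stored depth differ only by rounding.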
Antialiasing: Examples
Antialiasing: Supersampling & Multisampling
Supersampling: Compute color and Z at higher resolution and display the averaged color to smooth out the visual artifacts.
Multisampling: Same thing, except only Z is computed at higher resolution.
Multisampling performs antialiasing on primitive edges only.
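The averaging step of supersampling can be sketched in Python as a box filter over a grayscale image rendered at `factor` times the display resolution in each direction. The function name `downsample` and the assumption that the image dimensions are multiples of `factor` are specific to this sketch.

```python
def downsample(samples, factor):
    """Average factor x factor blocks of a supersampled grayscale image."""
    height = len(samples) // factor
    width = len(samples[0]) // factor
    out = []
    for y in range(height):
        row = []
        for x in range(width):
            block = [samples[y * factor + j][x * factor + i]
                     for j in range(factor) for i in range(factor)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out
```

A display pixel that straddles a primitive edge ends up with an intermediate value, which is what smooths the staircase artifact.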
X-Isle: Dinosaur Isle
Image courtesy of Crytek
GeForce 4
Multiple Vertex and Pixel Shaders
Wolfman
GeForce FX – >100M Transistors
Programmable Pixel Shading (DirectX 9.0)
Scalable Architecture
'02-'03: Programmable Pixel Shader
CPU: Application Stage
GPU:
  Geometry Stage: Vertex Shader (static & dynamic flow control)
  Rasterization Stage: Rasterizer w/ Z-Cull, Pixel Shader (static flow control), Texture Unit (16 textures), Raster Operations Unit
Bus (AGP): 3D Triangles, Textures
System Memory: 3D Triangles, Textures
Video Memory: Textures, Frame Buffer
PC Graphics Software Architecture
CPU: Application -> 3D API (OpenGL or DirectX) -> Driver
Bus: Commands, Programs, Geometry, Textures
GPU: Vertex Shader (runs the Vertex Program) -> Pixel Shader (runs the Pixel Program) -> Frame Buffer
The application, 3D API and driver are written in C or C++.
The vertex and pixel programs are written in a high-level shading language (DirectX HLSL, OpenGL Shading Language, Cg).
Pushbuffer: Contains the commands to be executed on the GPU.
Pixel Shader
A programmable processor for any per-pixel computation

void PixelShader(
    // Input per pixel
    in float2 textureCoordinates,
    in float3 normal,

    // Input per batch of triangles
    uniform sampler2D baseTexture,
    uniform float3 lightDirection,

    // Output per pixel
    out float3 color
)
{
    // Texture lookup
    float3 baseColor = tex2D(baseTexture, textureCoordinates);

    // Light computation
    float light = dot(lightDirection, normal);

    // Pixel color computation
    color = baseColor * light;
}
Shader: Static vs. Dynamic Flow Control

void Shader(
    ...
    // Input per vertex or per pixel
    in float3 normal,

    // Input per batch of triangles
    uniform float3 lightDirection,
    uniform bool computeLight,
    ...
)
{
    ...
    // Static flow control (condition varies per batch of triangles)
    if (computeLight) {
        ...
        // Dynamic flow control (condition varies per vertex or pixel)
        if (dot(lightDirection, normal)) {
            ...
        }
        ...
    }
    ...
}
GeForce 6800 – 220M Transistors
Shader Model 3.0
FP64 & High Dynamic Range
2004: Shader Model 3.0 & 64-Bit Color Support
CPU: Application Stage
GPU:
  Geometry Stage: Vertex Shader (static & dynamic flow control)
  Rasterization Stage: Rasterizer w/ Z-Cull, Pixel Shader (static & dynamic flow control), Texture Unit, Raster Operations Unit
  64-Bit Color
Bus (PCIe): 3D Triangles, Textures
System Memory: 3D Triangles, Textures
Video Memory: Textures, Frame Buffer
Shader Model 3.0
Longer shaders → More complex shading
Pixel shader:
  Dynamic flow control → Better performance
  Derivative instructions → Shader antialiasing
  Support for 32-bit floating-point precision → Fewer artifacts
  Face register → Faster two-sided lighting
Vertex shader:
  Texture access → Simulation on GPU, displacement mapping
  Geometry Instancing → Better performance
Images: Lord of the Rings™ The Battle for Middle-earth™; Far Cry
The GeForce 6 Series: Amazing Real-time Effects
Shader Model 3.0 / 64-bit Floating Point Processing: Unreal Engine 3.0 Running On GeForce 6 Series
  2 Million Triangle Detail Mesh Models
  High Dynamic Range Rendering
  Fully Customizable Shaders
Shader Model 3.0 / 64-bit Floating Point Processing: Unreal Engine 3.0 Running On GeForce 6 Series
  100 Million Triangle Source Content Scene
  High Dynamic Range Rendering
Unreal Engine 3.0 images courtesy of Epic Games
High Dynamic Range Imagery
The dynamic range of a scene is the ratio of the highest to the lowest luminance
Real-life scenes can have high dynamic ranges of several millions
Display and print devices have a low dynamic range of around 100
Tone mapping is the process of displaying high dynamic range images on those low dynamic range devices
High dynamic range images use floating-point colors
OpenEXR is a high dynamic range image format that is compatible with NVIDIA’s 64-bit color format
HDR Rendering Engine:
  Compute surface reflectance, save in HDR buffer
    Contributions from multiple lights are additive (blended)
  Add image-space special effects & post to HDR buffer
    AA, Glow, Depth of Field, Motion Blur
Real-Time Tone Mapping
The image is entirely computed in 64-bit color and tone-mapped for display
Renderings of the same scene, from low to high exposure
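To make the dynamic range ratio and the exposure control concrete, here is a minimal Python sketch. The slides do not say which tone mapping operator was used; the curve below, `x / (1 + x)`, is one widely used simple operator chosen for illustration, and the function names and `exposure` parameter are assumptions of this sketch.

```python
def dynamic_range(luminances):
    """Ratio of the highest to the lowest non-zero luminance in a scene."""
    positive = [value for value in luminances if value > 0]
    return max(positive) / min(positive)


def tone_map(luminance, exposure=1.0):
    """Compress an HDR luminance into [0, 1) for a low dynamic range display.

    Uses the simple curve x / (1 + x) after scaling by exposure
    (an illustrative choice, not necessarily the operator in the demo).
    """
    scaled = luminance * exposure
    return scaled / (1.0 + scaled)
```

Raising `exposure` brightens the whole image, while the curve keeps very bright values from clipping abruptly, which corresponds to the low-to-high exposure series shown on the slide.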
Evolution of Performance
[Chart, 1994 to 2004, logarithmic scale from 0.1 to 10000. Series: CPU Frequency (GHz), Bus Bandwidth (GB/sec), Pixel Fill Rate (MPixels/sec), Vertex Rate (MVerts/sec), Graphics Bandwidth (GB/sec). Timeline annotations: video memory 4 MB, 32 MB, 64 MB, 128 MB, 256 MB, 512 MB; DirectX 6, DirectX 7, DirectX 8, DirectX 9; OpenGL 1.1, OpenGL 1.4, OpenGL 1.5]
Looking Ahead: Now + 10 years
[Chart, 1994 to 2014, logarithmic scale from 0.1 to 1000000. Series: CPU Frequency (GHz), Bus Bandwidth (GB/sec), Pixel Fill Rate (MPixels/sec), Vertex Rate (MVerts/sec), Graphics flops (GFlops/sec), Graphics Bandwidth (GB/sec). Annotations: 127 Gvertx, 100 GHz, .5 Mverts, 100 MHz]
NVIDIA SLI Multi-GPU
Progression of Graphics
  Virtua Fighter: NV1, 1M transistors
  Wanda: NV1x, 22M transistors
  Wolfman: NV2x, 63M transistors
  Dawn: NV3x, 130M transistors
  Nalu: NV4x, 222M transistors
Thank You