Evolution of NV40

Chris Seitz

Progression of Graphics
Virtua Fighter: NV1, 1 Mtrans
Wanda: NV1x, 22 Mtrans
Wolfman: NV2x, 63 Mtrans
Dawn: NV3x, 130 Mtrans
Nalu: NV4x, 222 Mtrans

©2004 NVIDIA Corporation. All rights reserved.

‘95-‘98: Texture Mapping & Z-Buffer
[Pipeline diagram]
CPU: Application / Geometry Stage
GPU: Rasterization Stage (Rasterizer, Texture Unit, Raster Operations Unit)
Bus (PCI): 2D triangles and textures pass from system memory to video memory
Video memory: frame buffer, textures

Multitexturing
Base texture modulated by light map
[Image: base texture × light map = lit result, from UT2004 (c) Epic Games Inc. Used with permission]

1999-2000: Transform and Lighting
[Pipeline diagram]
CPU: Application Stage
GPU ("Fixed Function Pipeline"):
    Geometry Stage: Transform & Lighting Unit
    Rasterization Stage: Rasterizer, Texture Unit, Register Combiner, Raster Operations Unit
Bus (AGP): 3D triangles and textures pass from system memory to video memory
Video memory: frame buffer, textures

2001: Programmable Vertex Shader
[Pipeline diagram]
CPU: Application Stage
GPU:
    Geometry Stage: Vertex Shader (no flow control)
    Rasterization Stage: Rasterizer w/ Z-Cull, Register Combiner and Texture Unit (4 textures), Raster Operations Unit
Bus (AGP): 3D triangles and textures pass from system memory to video memory
Video memory: frame buffer, textures
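The Multitexturing slide above describes a base texture "modulated by" a light map. A minimal CPU-side sketch of that operation follows, assuming that "modulate" means a component-wise multiply of RGB values stored in the range [0, 1]. The function names `modulate` and `multitexture` are hypothetical and are not taken from the deck or from any graphics API.

```python
def modulate(base_texel, light_texel):
    """Component-wise multiply of two RGB texels with channels in [0, 1]."""
    return tuple(b * l for b, l in zip(base_texel, light_texel))


def multitexture(base, light_map):
    """Apply modulate() texel by texel to two images of identical size.

    Each image is a list of rows, each row a list of (r, g, b) tuples.
    """
    return [
        [modulate(b, l) for b, l in zip(base_row, light_row)]
        for base_row, light_row in zip(base, light_map)
    ]


if __name__ == "__main__":
    base = [[(1.0, 0.5, 0.25), (1.0, 1.0, 1.0)]]
    light = [[(0.5, 0.5, 0.5), (0.0, 0.0, 0.0)]]
    print(multitexture(base, light))
```

A dark light-map texel darkens the base texture and a white one leaves it unchanged, so static lighting can be stored once in a low-resolution light map and reused with any base texture.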
Vertex Shader
A programmable processor for any per-vertex computation

    void VertexShader(
        // Input per vertex
        in float4 positionInModelSpace,
        in float2 textureCoordinates,
        in float3 normal,
        // Input per batch of triangles
        uniform float4x4 modelToProjection,
        uniform float3 lightDirection,
        // Output per vertex
        out float4 positionInProjectionSpace,
        out float2 textureCoordinatesOutput,
        out float3 color
    )
    {
        // Vertex transformation
        positionInProjectionSpace = mul(modelToProjection, positionInModelSpace);
        // Texture coordinates copy
        textureCoordinatesOutput = textureCoordinates;
        // Vertex color computation
        color = dot(lightDirection, normal);
    }

Bump Mapping
Bump mapping involves fetching the per-pixel normal from a normal map texture (instead of using the interpolated vertex normal) in order to compute lighting at a given pixel.
[Image: diffuse light + normal map = diffuse light with bumps]

‘02-‘03: Programmable Pixel Shader
[Pipeline diagram]
CPU: Application Stage
GPU:
    Geometry Stage: Vertex Shader (static & dynamic flow control)
    Rasterization Stage: Rasterizer w/ Z-Cull, Pixel Shader (static flow control) and Texture Unit (16 textures), Raster Operations Unit
Bus (AGP): 3D triangles and textures pass from system memory to video memory
Video memory: frame buffer, textures

GeForce 6800
Revolutionary performance & complete Shader Model 3.0
Complete Native Shader Model 3.0 Support
    Full support for shader model 3.0
    Vertex Texture Fetch / Long programs / Pixel Shader flow control
    Full speed fp32 shading
OpenEXR High Dynamic Range Rendering
    Floating point frame buffer blending
    Floating point texture filtering
Unparalleled Performance
    222M xtors / 0.13um @ IBM
    6 vertex units / 16 pixel pipelines
Next Generation Video
    VMR / High quality compositing
    Hardware MPEG encode / decode
    HDTV Output
PCI Express
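The Bump Mapping slide above describes lighting a pixel with a normal fetched from a normal map instead of the interpolated vertex normal. The sketch below walks through that per-pixel step on the CPU. It rests on two assumptions that the deck does not state: the normal map stores each component remapped from [-1, 1] to [0, 1], and the lighting term is a clamped Lambertian dot product. All function names are hypothetical.

```python
import math


def normalize(v):
    """Return v scaled to unit length (a zero vector is returned unchanged)."""
    length = math.sqrt(sum(c * c for c in v))
    if length == 0.0:
        return tuple(v)
    return tuple(c / length for c in v)


def decode_normal(texel):
    """Assumed encoding: each channel in [0, 1] maps back to [-1, 1]."""
    return normalize(tuple(2.0 * c - 1.0 for c in texel))


def diffuse(normal, light_dir):
    """Clamped dot product of two unit vectors (Lambertian term)."""
    return max(0.0, sum(n * l for n, l in zip(normal, light_dir)))


def shade_pixel(normal_map_texel, light_dir):
    """Per-pixel diffuse intensity using the normal fetched from the map."""
    return diffuse(decode_normal(normal_map_texel), normalize(light_dir))


if __name__ == "__main__":
    flat = (0.5, 0.5, 1.0)     # decodes to (0, 0, 1): an unperturbed surface
    tilted = (1.0, 0.5, 1.0)   # decodes to a normal leaning toward +x
    light = (0.0, 0.0, 1.0)
    print(shade_pixel(flat, light), shade_pixel(tilted, light))
```

Because the normal varies from texel to texel, the diffuse term varies across a flat triangle, which is what produces the appearance of bumps without extra geometry.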
2004: Shader Model 3.0 & 64-Bit Color Support
[Pipeline diagram]
CPU: Application Stage
GPU:
    Geometry Stage: Vertex Shader (static & dynamic flow control)
    Rasterization Stage: Rasterizer w/ Z-Cull, Pixel Shader (static & dynamic flow control), Texture Unit, Raster Operations Unit
    64-Bit Color
Bus (PCIe): 3D triangles and textures pass from system memory to video memory
Video memory: frame buffer, textures

Shader Model 3.0
Longer shaders → More complex shading
Pixel shader:
    Dynamic flow control → Better performance
    Derivative instructions → Shader antialiasing
    Support for 32-bit floating-point precision → Fewer artifacts
    Face register → Faster two-sided lighting
Vertex shader:
    Texture access → Simulation on GPU, displacement mapping
Geometry Instancing → Better performance
[Images: Lord of the Rings™ The Battle for Middle-earth™, Far Cry, SpeedTree]

Complete Native DirectX 9 Support

                              DX9      DX9.0c
Vertex Shader Model           2.0      3.0
Vtx Shader Instructions       256      2^16 (65,535)
Displacement Mapping          -        yes
Vertex Texture Fetch          -        yes
Geometry Instancing           -        yes
Dynamic Flow Control          -        yes
Pixel Shader Model            2.0      3.0
Required Shader Precision     fp24     fp32
Pixel Shader Instructions     96       2^16 (65,535)
Subroutines                   -        yes
Loops & Branches              -        yes
Dynamic Flow Control          -        yes

GeForce 6800 Graphics
Unparalleled Performance
Up to 8x pixel shading performance: combination of 4x the pipes and 2x the math per pipe
2x vertex shading performance: MIMD architecture, dual-issue, very efficient branching
Next generation UltraShadow: 4x the performance of NV35, 32ppc for z/stencil rendering
256-bit DDR3, 1.1 GHz DDR data rate

GeForce 6800 series 3D Pipeline
[Block diagram: Triangle Setup, Z-Cull, Shader Instruction Dispatch, L2 Tex, Fragment Crossbar, four Memory Partitions]
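The GeForce 6800 Graphics slide above quotes "up to 8x" pixel shading as the product of 4x the pipes and 2x the math per pipe, and lists a 256-bit memory interface at a 1.1 GHz data rate. The sketch below works through that arithmetic. It assumes ideal scaling and 1 GB = 10^9 bytes, so the results are theoretical peaks rather than measured figures; the function names are hypothetical.

```python
def peak_shading_speedup(pipe_ratio, math_per_pipe_ratio):
    """Ideal speedup when pipe count and per-pipe math scale independently."""
    return pipe_ratio * math_per_pipe_ratio


def peak_memory_bandwidth_gb_per_s(bus_width_bits, data_rate_ghz):
    """Peak bandwidth in GB/s: bytes moved per transfer times transfers per ns."""
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * data_rate_ghz


if __name__ == "__main__":
    # 4x the pipes and 2x the math per pipe, as stated on the slide.
    print(peak_shading_speedup(4, 2))
    # 256-bit interface at a 1.1 GHz data rate, as stated on the slide.
    print(peak_memory_bandwidth_gb_per_s(256, 1.1))
```

Under these assumptions the slide's memory figures correspond to a peak of roughly 35.2 GB/s, and the "up to" in the shading claim reflects that the 8x product is reached only when a shader keeps every pipe and math unit busy.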
NV40 Shader Performance
[Bar chart. Axis: multiples of GeForce FX 5950 performance on shaders, 0.0 to 10.0. Benchmarks: UT, UE3, Tomb Raider, Splinter Cell, Shadermark, Protomark, HL2, HDR Light Demo, Halo, Gunmetal, Film, FarCry, DeusEx, Aquamark, 3DMark03]

Shader Model 3.0 / 64-bit Floating Point Processing
Unreal Engine 3.0 Running On GeForce 6 Series
2 Million Triangle Detail Mesh Models
High Dynamic Range Rendering
Fully Customizable Shaders
Unreal Engine 3.0 Images Courtesy of Epic Games

Shader Model 3.0 / 64-bit Floating Point Processing
Unreal Engine 3.0 Running On GeForce 6 Series
100 Million Triangle Source Content Scene
High Dynamic Range Rendering
Unreal Engine 3.0 Images Courtesy of Epic Games

Demo: Real-Time Tone Mapping
The image is entirely computed in 64-bit color and tone-mapped for display
[Images: renderings of the same scene, from low to high exposure]

Looking Ahead: Now + 10 years
[Chart: logarithmic scale from 0.1 to 1,000,000, time axis 1994, 2004, 2014. Series: CPU Frequency (GHz), Bus Bandwidth (GB/sec), Pixel Fill Rate (MPixels/sec), Vertex Rate (MVerts/sec), Graphics flops (GFlops/sec), Graphics Bandwidth (GB/sec). Labels on the plot include ".5 Mverts", "127 Gvertx", "100 MHz", and "100 GHz"]

NVIDIA SLI Multi-GPU

Thank You.
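As a supplement to the Real-Time Tone Mapping slide above, here is a minimal sketch of mapping high dynamic range values to a displayable 8-bit range at different exposures. The deck does not say which operator the demo used; the exponential curve `1 - exp(-exposure * value)` is an assumption chosen because it is simple and common, and the function names are hypothetical.

```python
import math


def tone_map(hdr_value, exposure):
    """Compress a non-negative HDR value into [0, 1) with an exposure curve."""
    return 1.0 - math.exp(-exposure * hdr_value)


def to_8bit(ldr_value):
    """Clamp to [0, 1] and quantize to an 8-bit display value."""
    return round(min(max(ldr_value, 0.0), 1.0) * 255)


def expose_pixel(hdr_rgb, exposure):
    """Tone-map one HDR RGB pixel for display at the given exposure."""
    return tuple(to_8bit(tone_map(c, exposure)) for c in hdr_rgb)


if __name__ == "__main__":
    pixel = (4.0, 1.0, 0.25)  # HDR values may exceed 1.0
    for exposure in (0.5, 1.0, 2.0):  # low to high exposure
        print(exposure, expose_pixel(pixel, exposure))
```

Raising the exposure brightens every channel while keeping the output within the display range, which mirrors the low-to-high exposure renderings the slide describes.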
