Evolution of the Programmable Graphics Pipeline
Patrick Cozzi
University of Pennsylvania
CIS 565 - Spring 2011

Administrivia
• Tip: google “cis 565”
• Slides posted before each class
• Tentative assignment dates on website
• 1st assignment handed out today
  • Write concisely
  • Due start of class, one week from today
• Google group in progress
• FYI: GDC Early Registration - 01/24

Survey Results
• 15/23 – graphics experience
• Most students have usable video cards
• Lerk – don’t be scared
• I want to be a Toys R Us kid too

Survey Results
• Class interests
  • Pure architecture
  • Game rendering
  • Physical simulations
  • Animation
  • Vision algorithms
  • Image/video processing
  • …

Course Roadmap
• Graphics Pipeline (GLSL)
• GPGPU (GLSL)
  • Briefly
• GPU Computing (CUDA, OpenCL)
• Choose your own adventure
  • Student Presentation
  • Final Project
• Goal: Prepare you for your presentation and project

Agenda
• Why program the GPU?
• Graphics Review
• Evolution of the Programmable Graphics Pipeline
  • Understand the past

Why Program the GPU?
Graphs from: http://developer.download.nvidia.com/compute/cuda/3_2_prod/toolkit/docs/CUDA_C_Programming_Guide.pdf

Why Program the GPU?
• Compute
  • Intel Core i7 – 4 cores – 100 GFLOP
  • NVIDIA GTX280 – 240 cores – 1 TFLOP
• Memory Bandwidth
  • System Memory – 60 GB/s
  • NVIDIA GT200 – 150 GB/s
• Install Base
  • Over 200 million NVIDIA G80s shipped
Numbers from Programming Massively Parallel Processors.

NVIDIA GPU Evolution
Slide from David Luebke: http://s08.idav.ucdavis.edu/luebke-nvidia-gpu-architecture.pdf

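As a quick worked check of the compute and bandwidth figures on the "Why Program the GPU?" slide, the sketch below computes the GPU-to-CPU ratios they imply. The variable names are illustrative, and the inputs are the peak figures quoted on the slide, not measurements.

```python
# Peak figures quoted on the "Why Program the GPU?" slide (not measured here).
cpu_gflops = 100.0          # Intel Core i7, 4 cores
gpu_gflops = 1000.0         # NVIDIA GTX280, 240 cores (1 TFLOP)
cpu_bandwidth_gbs = 60.0    # system memory
gpu_bandwidth_gbs = 150.0   # NVIDIA GT200

compute_ratio = gpu_gflops / cpu_gflops
bandwidth_ratio = gpu_bandwidth_gbs / cpu_bandwidth_gbs
per_core_cpu = cpu_gflops / 4
per_core_gpu = gpu_gflops / 240

print(f"compute: {compute_ratio:.1f}x, bandwidth: {bandwidth_ratio:.1f}x")
print(f"GFLOP per core: CPU {per_core_cpu:.1f}, GPU {per_core_gpu:.2f}")
```

On these figures each GPU core is slower than a CPU core, so the advantage comes from having many of them working in parallel.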
Graphics Review
• Modeling
• Rendering
• Animation

Graphics Review: Modeling
• Modeling
  • Polygons vs Triangles
  • How do you store a triangle mesh?
  • Implicit Surfaces
  • Height maps
  • …

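One common answer to "How do you store a triangle mesh?" is an indexed triangle list: an array of vertex positions plus triples of indices into it, so that shared vertices are stored once. The sketch below is a minimal illustration; the names `positions`, `indices`, and `triangle_vertices` are illustrative and not from the slides.

```python
# Indexed triangle list for a unit square split into two triangles.
positions = [
    (0.0, 0.0, 0.0),  # vertex 0
    (1.0, 0.0, 0.0),  # vertex 1
    (1.0, 1.0, 0.0),  # vertex 2
    (0.0, 1.0, 0.0),  # vertex 3
]
# Three consecutive indices form one triangle (counter-clockwise winding).
indices = [0, 1, 2,
           0, 2, 3]

def triangle_vertices(positions, indices, t):
    """Return the three vertex positions of triangle number t."""
    i0, i1, i2 = indices[3 * t : 3 * t + 3]
    return positions[i0], positions[i1], positions[i2]

triangle_count = len(indices) // 3
for t in range(triangle_count):
    print(t, triangle_vertices(positions, indices, t))

# Without indexing, the same mesh would store 3 vertices per triangle.
unindexed_vertex_count = 3 * triangle_count
indexed_vertex_count = len(positions)
```

Here the two triangles share an edge, so indexing stores 4 vertices instead of 6; the saving grows with larger meshes where most vertices are shared.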
Triangles
Images courtesy of A K Peters, Ltd. www.virtualglobebook.com. Imagery from NASA Visible Earth: visibleearth.nasa.gov.

Implicit Surfaces
Images from GPU Gems 3: http://http.developer.nvidia.com/GPUGems3/gpugems3_ch01.html

Height Maps
Image courtesy of A K Peters, Ltd. www.virtualglobebook.com

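A height map stores one elevation per point of a regular grid, so vertex positions and connectivity are implied by the grid rather than stored explicitly. The sketch below expands a small height map into an indexed triangle mesh; the function name, grid layout, and sample heights are assumptions made for illustration.

```python
def height_map_to_mesh(heights):
    """Turn a 2D grid of heights into (positions, indices).

    heights[row][col] is the elevation at grid point (x=col, y=row).
    Each grid cell becomes two triangles.
    """
    rows, cols = len(heights), len(heights[0])
    positions = [(float(c), float(r), float(heights[r][c]))
                 for r in range(rows) for c in range(cols)]
    indices = []
    for r in range(rows - 1):
        for c in range(cols - 1):
            i = r * cols + c  # grid point at the cell's (row, col) corner
            indices += [i, i + 1, i + cols + 1,      # first triangle
                        i, i + cols + 1, i + cols]   # second triangle
    return positions, indices

heights = [[0, 1, 0],
           [1, 2, 1],
           [0, 1, 0]]
positions, indices = height_map_to_mesh(heights)
print(len(positions), "vertices,", len(indices) // 3, "triangles")
```

Only the heights need to be stored; the x and y coordinates and the triangle indices can be regenerated from the grid dimensions.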
Graphics Review: Rendering
• Rendering
  • Goal: Assign color to pixels
• Two Parts
  • Visible surfaces
    • What is in front of what for a given view
  • Shading
    • Simulate the interaction of material and light to produce a pixel color
• What about ray tracing?

Rasterization

Visible Surfaces
• Z-Buffer / Depth Buffer
• Fragment vs Pixel
Image courtesy of A K Peters, Ltd. www.virtualglobebook.com

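The Z-buffer resolves visibility per pixel: each incoming fragment carries a depth, and it replaces the stored color only if it is closer than what is already there. The sketch below runs that test on a tiny one-row framebuffer. It assumes smaller depth means closer (conventions vary), and all names and fragment values are illustrative.

```python
WIDTH, HEIGHT = 4, 1
FAR = float("inf")

color_buffer = [["background"] * WIDTH for _ in range(HEIGHT)]
depth_buffer = [[FAR] * WIDTH for _ in range(HEIGHT)]

def write_fragment(x, y, depth, color):
    """Depth-test one fragment; keep it only if it is nearer."""
    if depth < depth_buffer[y][x]:
        depth_buffer[y][x] = depth
        color_buffer[y][x] = color

# Fragments from two overlapping primitives, submitted in arbitrary order.
fragments = [
    (0, 0, 0.8, "red"), (1, 0, 0.8, "red"), (2, 0, 0.8, "red"),
    (1, 0, 0.3, "blue"), (2, 0, 0.9, "blue"), (3, 0, 0.5, "blue"),
]
for x, y, depth, color in fragments:
    write_fragment(x, y, depth, color)

print(color_buffer[0])
```

This also illustrates "fragment vs pixel": several fragments can land on the same pixel location, but only one of them ends up as that pixel's color.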
Shading
Images courtesy of A K Peters, Ltd. www.virtualglobebook.com
Image from GPU Gems 3: http://http.developer.nvidia.com/GPUGems3/gpugems3_ch14.html

Graphics Pipeline
Vertex Transforms -> Primitive Assembly -> Rasterization and Interpolation -> Raster Operations -> Frame Buffer
• Scissor Test
• Stencil Test
• Depth Test
• Blending
Images courtesy of A K Peters, Ltd. http://www.realtimerendering.com/

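Of the operations listed on this slide, blending is the one that combines rather than rejects: a fragment that survives the tests is mixed with the color already in the frame buffer. The sketch below assumes the common "source over" alpha blend; the slides do not specify a blend mode, and `blend_over` is a hypothetical helper.

```python
def blend_over(src_rgb, src_alpha, dst_rgb):
    """Source-over blend: result = src * alpha + dst * (1 - alpha)."""
    return tuple(s * src_alpha + d * (1.0 - src_alpha)
                 for s, d in zip(src_rgb, dst_rgb))

frame_buffer_color = (0.0, 0.0, 1.0)   # blue already in the frame buffer
fragment_color = (1.0, 0.0, 0.0)       # incoming red fragment
fragment_alpha = 0.25                  # 25% opaque

blended = blend_over(fragment_color, fragment_alpha, frame_buffer_color)
print(blended)
```

With an alpha of 0.25 the result keeps three quarters of the existing blue and takes one quarter of the incoming red.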
Graphics Review: Animation
• Move the camera and/or agents, and re-render the scene
  • In less than 16.6 ms (60 fps)

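The 16.6 ms figure follows directly from the frame rate: the time budget for one frame is 1000 ms divided by the frames per second (1000 / 60 is about 16.67, which the slide truncates to 16.6). The sketch below computes the budget for a few rates; the per-frame costs in the second half are made-up placeholders, not measurements.

```python
def frame_budget_ms(fps):
    """Time available to produce one frame, in milliseconds."""
    return 1000.0 / fps

for fps in (30, 60, 120):
    print(f"{fps} fps -> {frame_budget_ms(fps):.2f} ms per frame")

# Hypothetical per-frame costs (placeholders, not measurements).
update_ms, render_ms = 4.0, 11.0
fits = update_ms + render_ms <= frame_budget_ms(60)
print("fits in a 60 fps frame:", fits)
```

Everything done per frame (moving the camera and agents, then rendering) has to fit inside that budget together.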
Evolution of the Programmable Graphics Pipeline
• Pre GPU
• Fixed function GPU
• Programmable GPU
• Unified Shader Processors

Early 90s – Pre GPU
Slide from Mike Houston: http://s09.idav.ucdavis.edu/talks/01-BPS-SIGGRAPH09-mhouston.pdf

Why GPUs?
• Exploit Parallelism
  • Pipeline parallel
  • Data-parallel
  • CPU and GPU executing in parallel
• Hardware: texture filtering, MAD, etc.

Generation I: 3dfx Voodoo (1996)
• Did not do vertex transformations: these were done in the CPU
• Did do texture mapping, z-buffering.
Image from “7 years of Graphics”
Vertex Transforms -> Primitive Assembly -> Rasterization and Interpolation -> Raster Operations -> Frame Buffer
Diagram labels: CPU, GPU, PCI
Slide adapted from Suresh Venkatasubramanian and Joe Kider

Aside: Mario Kart 64
• High fragment load / low vertex load
Image from: http://www.gamespot.com/users/my_shoe/

Aside: Mario Kart Wii
• High fragment load / low vertex load?
Image from: http://wii.ign.com/dor/objects/949580/mario-kart-wii/images/

Generation II: GeForce/Radeon 7500 (1998)
• Main innovation: shifting the transformation and lighting calculations to the GPU
• Allowed multi-texturing: giving bump maps, light maps, and others.
• Faster AGP bus instead of PCI
Image from “7 years of Graphics”
Vertex Transforms -> Primitive Assembly -> Rasterization and Interpolation -> Raster Operations -> Frame Buffer
Diagram labels: GPU, AGP
Slide from Suresh Venkatasubramanian and Joe Kider

Generation III: GeForce3/Radeon 8500 (2001)
• For the first time, allowed limited amount of programmability in the vertex pipeline
• Also allowed volume texturing and multi-sampling (for antialiasing)
Image from “7 years of Graphics”
Vertex Transforms -> Primitive Assembly -> Rasterization and Interpolation -> Raster Operations -> Frame Buffer
Diagram labels: GPU, AGP, Small vertex shaders
Slide from Suresh Venkatasubramanian and Joe Kider

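The "transformation and lighting calculations" that Generation II moved to the GPU are per-vertex arithmetic: a matrix transform of each position and a lighting term from each normal. The CPU-side sketch below shows the kind of work involved. It assumes a simple translation matrix and a Lambert diffuse term; the function names and data are illustrative and not taken from any particular card or API.

```python
def transform(matrix, v):
    """Apply a 4x4 row-major matrix to the point (x, y, z, 1); return (x, y, z)."""
    p = (v[0], v[1], v[2], 1.0)
    return tuple(sum(matrix[r][c] * p[c] for c in range(4)) for r in range(3))

def lambert(normal, light_dir):
    """Diffuse term: clamped dot product of unit normal and unit light direction."""
    return max(0.0, sum(n * l for n, l in zip(normal, light_dir)))

# Translate by (2, 0, 0); there is no rotation, so normals are unchanged.
model = [[1.0, 0.0, 0.0, 2.0],
         [0.0, 1.0, 0.0, 0.0],
         [0.0, 0.0, 1.0, 0.0],
         [0.0, 0.0, 0.0, 1.0]]

# (position, unit normal) pairs
vertices = [((0.0, 0.0, 0.0), (0.0, 0.0, 1.0)),
            ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0))]
light_dir = (0.0, 0.0, 1.0)   # unit vector pointing toward the light

results = [(transform(model, pos), lambert(nrm, light_dir))
           for pos, nrm in vertices]
for position, diffuse in results:
    print(position, diffuse)
```

Each vertex is processed independently of the others, which is why this stage maps well onto data-parallel hardware.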
Generation IV: Radeon 9700/GeForce FX (2002)
• This generation is the first generation of fully-programmable graphics cards
• Different versions have different resource limits on fragment/vertex programs
Image from “7 years of Graphics”
Vertex Transforms -> Primitive Assembly -> Rasterization and Interpolation -> Raster Operations -> Frame Buffer
Diagram labels: AGP, Programmable Vertex shader, Programmable Fragment Processor, Texture Memory
Slide from Suresh Venkatasubramanian and Joe Kider

Generation IV.V: GeForce6/X800 (2004)
• Simultaneous rendering to multiple buffers
• True conditionals and loops
• PCIe bus
• Vertex texture fetch
Vertex Transforms -> Primitive Assembly -> Rasterization and Interpolation -> Raster Operations -> Frame Buffer
Diagram labels: PCIe, Programmable Vertex shader, Programmable Fragment Processor, Texture Memory
Slide adapted from Suresh Venkatasubramanian and Joe Kider

NVIDIA NV40 Architecture
• 6 vertex shader units
• Vertex Texture Fetch
• 16 fragment shader units
Image from GPU Gems 2: http://http.developer.nvidia.com/GPUGems2/gpugems2_chapter30.html

Generation V: GeForce8800/HD2900 (2006)
• Ground-up GPU redesign
• Support for Direct3D 10 / OpenGL 3
• Geometry Shaders
• Stream out / transform-feedback
• Unified shader processors
• Support for General GPU programming
Input Assembler -> Programmable Vertex shader -> Programmable Geometry Shader -> Raster Operations -> Programmable Pixel (Fragment) Shader -> Output Merger
Diagram labels: PCIe
Slide adapted from Suresh Venkatasubramanian and Joe Kider

D3D 10 Pipeline
Image from David Blythe: http://download.microsoft.com/download/f/2/d/f2d5ee2c-b7ba-4cd0-9686-b6508b5479a1/direct3d10_web.pdf

Geometry Shaders: Point Sprites

Geometry Shaders
Image from David Blythe: http://download.microsoft.com/download/f/2/d/f2d5ee2c-b7ba-4cd0-9686-b6508b5479a1/direct3d10_web.pdf

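Point sprites are a typical geometry shader use: one input point is expanded into a screen-aligned quad. The sketch below performs that expansion on the CPU to show the idea. It is not shader code, and the function name, vertex order, and sizes are assumptions made for illustration.

```python
def expand_point_sprite(center, size):
    """Emit the 4 corners of a screen-aligned quad in triangle-strip order."""
    cx, cy = center
    h = size / 2.0
    return [(cx - h, cy - h),   # lower left
            (cx + h, cy - h),   # lower right
            (cx - h, cy + h),   # upper left
            (cx + h, cy + h)]   # upper right

points = [(10.0, 10.0), (20.0, 5.0)]
quads = [expand_point_sprite(p, 4.0) for p in points]
print(quads[0])
```

The application submits one vertex per sprite, and the four-vertex strip is generated later in the pipeline, which is the amplification a geometry shader makes possible.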
NVIDIA G80 Architecture
Slide from David Luebke: http://s08.idav.ucdavis.edu/luebke-nvidia-gpu-architecture.pdf

Why Unify Shader Processors?
Slide from David Luebke: http://s08.idav.ucdavis.edu/luebke-nvidia-gpu-architecture.pdf

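One common argument for unifying shader processors is load balance: with dedicated vertex and fragment units, whichever pool is not the bottleneck sits idle. The toy model below is not taken from the slide. The unit counts reuse the NV40's 6 vertex and 16 fragment units, the workload numbers are invented, and it assumes the two stages overlap perfectly.

```python
def fixed_time(vertex_work, fragment_work, vertex_units=6, fragment_units=16):
    """Dedicated pools: the slower stage sets the frame time."""
    return max(vertex_work / vertex_units, fragment_work / fragment_units)

def unified_time(vertex_work, fragment_work, total_units=22):
    """Unified pool: any unit can take either kind of work."""
    return (vertex_work + fragment_work) / total_units

# (vertex work, fragment work) in arbitrary units; invented for illustration.
workloads = {
    "fragment heavy": (60, 1600),
    "vertex heavy": (1200, 160),
}
for name, (v, f) in workloads.items():
    print(f"{name}: fixed {fixed_time(v, f):.1f}, unified {unified_time(v, f):.1f}")
```

In this model the unified pool is never slower than the fixed split, and the gap is largest when the workload is far from the ratio the fixed design was built for.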
Unified Shader Processors
Slide from David Luebke: http://s08.idav.ucdavis.edu/luebke-nvidia-gpu-architecture.pdf

Terminology
Shader Model   Direct3D   OpenGL   Video card example
2, 3           9          2.x      NVIDIA GeForce 6800, ATI Radeon X800
4              10.x       3.x      NVIDIA GeForce 8800, ATI Radeon HD 2900
5              11.x       4.x      NVIDIA GeForce GTX 480, ATI Radeon HD 5870

Shader Capabilities
Table courtesy of A K Peters, Ltd. http://www.realtimerendering.com/

Evolution of the Programmable Graphics Pipeline
Slide from Mike Houston: http://s09.idav.ucdavis.edu/talks/01-BPS-SIGGRAPH09-mhouston.pdf

Evolution of the Programmable Graphics Pipeline
• Not covered today:
  • SM 5 / D3D 11 / GL 4
  • Tessellation shaders
    • *cough* student presentation *cough*
  • Later this semester: NVIDIA Fermi
    • Dual warp scheduler
    • Configurable L1 / shared memory
    • Double precision
    • …

New Tool: AMD System Monitor
• Released 01/04/2011
• http://support.amd.com/us/kbarticles/Pages/AMDSystemMonitor.aspx