Evolution of the Programmable Graphics Pipeline

Patrick Cozzi
University of Pennsylvania
CIS 565 - Spring 2011

Administrivia
• Tip: google “cis 565”
• Slides posted before each class
• Tentative assignment dates on website
• 1st assignment handed out today
  • Write concisely
  • Due start of class, one week from today
• Google group in progress
• FYI: GDC Early Registration - 01/24

Survey Results
• 15/23 – graphics experience
• Most students have usable video cards
• Lerk – don’t be scared
• I want to be a Toys R Us kid too

Survey Results
• Class interests
  • Pure architecture
  • Game rendering
  • Physical simulations
  • Animation
  • Vision algorithms
  • Image/video processing
  • …

Course Roadmap
• Graphics Pipeline (GLSL)
• GPGPU (GLSL)
  • Briefly
• GPU Computing (CUDA, OpenCL)
• Choose your own adventure
  • Student Presentation
  • Final Project
• Goal: Prepare you for your presentation and project

Agenda
• Why program the GPU?
• Graphics Review
• Evolution of the Programmable Graphics Pipeline
  • Understand the past

Why Program the GPU?
[Two graph slides]
Graphs from: http://developer.download.nvidia.com/compute/cuda/3_2_prod/toolkit/docs/CUDA_C_Programming_Guide.pdf

Why Program the GPU?
• Compute
  • Intel Core i7 – 4 cores – 100 GFLOP
  • NVIDIA GTX280 – 240 cores – 1 TFLOP
• Memory Bandwidth
  • System Memory – 60 GB/s
  • NVIDIA GT200 – 150 GB/s
• Install Base
  • Over 200 million NVIDIA G80s shipped
Numbers from Programming Massively Parallel Processors.

NVIDIA GPU Evolution
Slide from David Luebke: http://s08.idav.ucdavis.edu/luebke-nvidia-gpu-architecture.pdf

Graphics Review
• Modeling
• Rendering
• Animation

Graphics Review: Modeling
• Modeling
  • Polygons vs Triangles
    • How do you store a triangle mesh?
  • Implicit Surfaces
  • Height maps
  • …

Triangles
Images courtesy of A K Peters, Ltd. www.virtualglobebook.com. Imagery from NASA Visible Earth: visibleearth.nasa.gov.
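One common answer to the question "How do you store a triangle mesh?" is an indexed triangle list: vertex positions are stored once, and each triangle is a triple of indices into that array. The following is a minimal sketch of that idea; the names `positions`, `indices`, `triangle_vertices`, and `unindexed_vertex_count` are hypothetical and do not come from the slides.

```python
# Indexed triangle list: each vertex is stored once, and triangles
# refer to vertices by index. All names here are illustrative.

# Four corner positions (x, y, z) of a unit quad in the z = 0 plane.
positions = [
    (0.0, 0.0, 0.0),
    (1.0, 0.0, 0.0),
    (1.0, 1.0, 0.0),
    (0.0, 1.0, 0.0),
]

# Two counter-clockwise triangles that share the edge between
# vertex 0 and vertex 2.
indices = [
    (0, 1, 2),
    (0, 2, 3),
]


def triangle_vertices(positions, indices, i):
    """Return the three vertex positions of triangle i."""
    return tuple(positions[k] for k in indices[i])


def unindexed_vertex_count(indices):
    """Vertices needed if every triangle stored its own three copies."""
    return 3 * len(indices)


if __name__ == "__main__":
    print(triangle_vertices(positions, indices, 1))
    print(len(positions), "shared vertices vs",
          unindexed_vertex_count(indices), "unshared")
```

For this quad, sharing saves two vertices; on large meshes where most vertices are shared by several triangles, the saving tends to be much larger.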
Implicit Surfaces
Images from GPU Gems 3: http://http.developer.nvidia.com/GPUGems3/gpugems3_ch01.html

Height Maps
Image courtesy of A K Peters, Ltd. www.virtualglobebook.com

Graphics Review: Rendering
• Rendering
  • Goal: Assign color to pixels
• Two Parts
  • Visible surfaces
    • What is in front of what for a given view
  • Shading
    • Simulate the interaction of material and light to produce a pixel color
• What about ray tracing?

Rasterization
[Image slide]

Visible Surfaces
• Z-Buffer / Depth Buffer
• Fragment vs Pixel
Images courtesy of A K Peters, Ltd. www.virtualglobebook.com

Shading
Images courtesy of A K Peters, Ltd. www.virtualglobebook.com
Image from GPU Gems 3: http://http.developer.nvidia.com/GPUGems3/gpugems3_ch14.html

Graphics Pipeline
[Diagram: Vertex Transforms -> Primitive Assembly -> Rasterization and Interpolation -> Raster Operations -> Frame Buffer]
• Raster Operations: Scissor Test, Stencil Test, Depth Test, Blending
Images courtesy of A K Peters, Ltd. http://www.realtimerendering.com/

Graphics Review: Animation
• Move the camera and/or agents, and re-render the scene
  • In less than 16.6 ms (60 fps)

Evolution of the Programmable Graphics Pipeline
• Pre GPU
• Fixed function GPU
• Programmable GPU
• Unified Shader Processors

Early 90s – Pre GPU
Slide from Mike Houston: http://s09.idav.ucdavis.edu/talks/01-BPS-SIGGRAPH09-mhouston.pdf

Why GPUs?
• Exploit Parallelism
  • Pipeline parallel
  • Data-parallel
  • CPU and GPU executing in parallel
• Hardware: texture filtering, MAD, etc.

Generation I: 3dfx Voodoo (1996)
• Did not do vertex transformations: these were done in the CPU
• Did do texture mapping, z-buffering
[Pipeline diagram: Vertex Transforms, Primitive Assembly, Rasterization and Interpolation, Raster Operations, Frame Buffer; labels: CPU, GPU, PCI]
Image from “7 years of Graphics”
Slide adapted from Suresh Venkatasubramanian and Joe Kider

Aside: Mario Kart 64
• High fragment load / low vertex load
Image from: http://www.gamespot.com/users/my_shoe/

Aside: Mario Kart Wii
• High fragment load / low vertex load?
Image from: http://wii.ign.com/dor/objects/949580/mario-kart-wii/images/

Generation II: GeForce/Radeon 7500 (1998)
• Main innovation: shifting the transformation and lighting calculations to the GPU
• Allowed multi-texturing: giving bump maps, light maps, and others
• Faster AGP bus instead of PCI
[Pipeline diagram: same five stages; labels: GPU, AGP]
Image from “7 years of Graphics”
Slide from Suresh Venkatasubramanian and Joe Kider

Generation III: GeForce3/Radeon 8500 (2001)
• For the first time, allowed limited amount of programmability in the vertex pipeline
• Also allowed volume texturing and multi-sampling (for antialiasing)
[Pipeline diagram: same five stages; labels: GPU, AGP, Small vertex shaders]
Image from “7 years of Graphics”
Slide from Suresh Venkatasubramanian and Joe Kider

Generation IV: Radeon 9700/GeForce FX (2002)
• This generation is the first generation of fully-programmable graphics cards
• Different versions have different resource limits on fragment/vertex programs
Image from “7 years of Graphics”

Generation IV.V: GeForce6/X800 (2004)
• Simultaneous rendering to multiple buffers
• True conditionals and loops
• PCIe bus
• Vertex texture fetch
[Pipeline diagrams for Generations IV and IV.V: Vertex Transforms, Primitive Assembly, Rasterization and Interpolation, Raster Operations, Frame Buffer; labels: Programmable Vertex shader, Programmable Fragment Processor, Texture Memory, AGP, PCIe]
Slides from and adapted from Suresh Venkatasubramanian and Joe Kider

NVIDIA NV40 Architecture
• 6 vertex shader units
• Vertex Texture Fetch
• 16 fragment shader units
Image from GPU Gems 2: http://http.developer.nvidia.com/GPUGems2/gpugems2_chapter30.html

Generation V: GeForce8800/HD2900 (2006)
• Ground-up GPU redesign
• Support for Direct3D 10 / OpenGL 3
• Geometry Shaders
• Stream out / transform-feedback
• Unified shader processors
• Support for General GPU programming
[Pipeline diagram: Input Assembler -> Programmable Vertex shader -> Programmable Geometry Shader -> Raster Operations -> Programmable Pixel (Fragment) Shader -> Output Merger; label: PCIe]
Slide adapted from Suresh Venkatasubramanian and Joe Kider

D3D 10 Pipeline
Geometry Shaders: Point Sprites
Geometry Shaders
Images from David Blythe: http://download.microsoft.com/download/f/2/d/f2d5ee2c-b7ba-4cd0-9686-b6508b5479a1/direct3d10_web.pdf

NVIDIA G80 Architecture
Slides from David Luebke: http://s08.idav.ucdavis.edu/luebke-nvidia-gpu-architecture.pdf

Why Unify Shader Processors?
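One way to read the motivation for unified shader processors is as a load-balancing argument. The sketch below compares a fixed split of 6 vertex units and 16 fragment units (the NV40 counts listed earlier) with a unified pool of the same total size. The workloads, the function names, and the cost model (one work item per unit per step, with no dependency between vertex and fragment work) are assumptions for illustration, not measurements from the slides.

```python
import math


def steps_fixed(vertex_work, fragment_work, vertex_units=6, fragment_units=16):
    """Steps to finish when each unit type can only run its own kind of work.

    Vertex and fragment units run concurrently in this toy model, so the
    slower side determines the total.
    """
    return max(math.ceil(vertex_work / vertex_units),
               math.ceil(fragment_work / fragment_units))


def steps_unified(vertex_work, fragment_work, units=22):
    """Steps to finish when any unit can run either kind of work."""
    return math.ceil((vertex_work + fragment_work) / units)


# Hypothetical workloads: (vertex work items, fragment work items).
workloads = {
    "vertex-heavy": (2000, 200),
    "fragment-heavy": (200, 2000),
}

if __name__ == "__main__":
    for name, (v, f) in workloads.items():
        print(name, "fixed:", steps_fixed(v, f), "unified:", steps_unified(v, f))
```

Under these assumptions the fixed split leaves one set of units mostly idle whenever the workload is lopsided, while the unified pool takes the same number of steps for either mix.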
Slides (Why Unify Shader Processors?) from David Luebke: http://s08.idav.ucdavis.edu/luebke-nvidia-gpu-architecture.pdf

Unified Shader Processors
Slide from David Luebke: http://s08.idav.ucdavis.edu/luebke-nvidia-gpu-architecture.pdf

Terminology

Shader Model   Direct3D   OpenGL   Video card Example
2              9          2.x      NVIDIA GeForce 6800, ATI Radeon X800
3              10.x       3.x      NVIDIA GeForce 8800, ATI Radeon HD 2900
4              11.x       4.x      NVIDIA GeForce GTX 480, ATI Radeon HD 5870

Shader Capabilities
Tables courtesy of A K Peters, Ltd. http://www.realtimerendering.com/

Evolution of the Programmable Graphics Pipeline
Slides from Mike Houston: http://s09.idav.ucdavis.edu/talks/01-BPS-SIGGRAPH09-mhouston.pdf

Evolution of the Programmable Graphics Pipeline
• Not covered today:
  • SM 5 / D3D 11 / GL 4
  • Tessellation shaders
    • *cough* student presentation *cough*
• Later this semester: NVIDIA Fermi
  • Dual warp scheduler
  • Configurable L1 / shared memory
  • Double precision
  • …

New Tool: AMD System Monitor
• Released 01/04/2011
• http://support.amd.com/us/kbarticles/Pages/AMDSystemMonitor.aspx
