Understanding the Graphics Pipeline
Lecture 2
Original Slides by: Suresh Venkatasubramanian
Updates by Joseph Kider

Lecture Outline
► A historical perspective on the graphics pipeline
    Dimensions of innovation.
    Where we are today
    Fixed-function vs programmable pipelines
► A closer look at the fixed-function pipeline
    Walk through the sequence of operations
    Reinterpret these as stream operations
► We can program the fixed-function pipeline!
    Some examples
► What constitutes data and memory, and how access affects program design.

The evolution of the pipeline
Elements of the graphics pipeline:
1. A scene description: vertices, triangles, colors, lighting
2. Transformations that map the scene to a camera viewpoint
3. “Effects”: texturing, shadow mapping, lighting calculations
4. Rasterizing: converting geometry into pixels
5. Pixel processing: depth tests, stencil tests, and other per-pixel operations.
Parameters controlling design of the pipeline:
1. Where is the boundary between CPU and GPU?
2. What transfer method is used?
3. What resources are provided at each step?
4. What units can access which GPU memory elements?

Generation I: 3dfx Voodoo (1996)
• One of the first true 3D game cards
• Worked by supplementing a standard 2D video card.
• Did not do vertex transformations: these were done in the CPU
• Did do texture mapping, z-buffering.
http://accelenation.com/?ac.id.123.2
[Pipeline diagram: Vertex Transforms -> Primitive Assembly -> Rasterization and Interpolation -> Raster Operations -> Frame Buffer. Labels: CPU, GPU, PCI]

Generation II: GeForce/Radeon 7500 (1998)
• Main innovation: shifting the transformation and lighting calculations to the GPU
• Allowed multi-texturing: giving bump maps, light maps, and others.
• Faster AGP bus instead of PCI
http://accelenation.com/?ac.id.123.5
[Pipeline diagram: Vertex Transforms -> Primitive Assembly -> Rasterization and Interpolation -> Raster Operations -> Frame Buffer. Labels: GPU, AGP]

Generation III: GeForce3/Radeon 8500 (2001)
• For the first time, allowed a limited amount of programmability in the vertex pipeline
• Also allowed volume texturing and multi-sampling (for antialiasing)
http://accelenation.com/?ac.id.123.7
[Pipeline diagram: Vertex Transforms -> Primitive Assembly -> Rasterization and Interpolation -> Raster Operations -> Frame Buffer. Labels: GPU, AGP, small vertex shaders]

Generation IV: Radeon 9700/GeForce FX (2002)
• This generation is the first generation of fully-programmable graphics cards
• Different versions have different resource limits on fragment/vertex programs
http://accelenation.com/?ac.id.123.8
[Pipeline diagram: Vertex Transforms -> Primitive Assembly -> Rasterization and Interpolation -> Raster Operations -> Frame Buffer. Labels: AGP, Programmable Vertex Shader, Programmable Fragment Processor, Texture Memory]

Generation IV.V: GeForce6/X800 (2004)
Not exactly a quantum leap, but…
► Simultaneous rendering to multiple buffers
► True conditionals and loops
► Higher precision throughput in the pipeline (64 bits end-to-end, compared to 32 bits earlier.)
► PCIe bus
► More memory/program length/texture accesses
► Texture access by vertex shader
[Pipeline diagram: Vertex Transforms -> Primitive Assembly -> Rasterization and Interpolation -> Raster Operations -> Frame Buffer. Labels: AGP, Programmable Vertex Shader, Programmable Fragment Processor, Texture Memory (shown twice)]

Generation V: GeForce8800/HD2900 (2006)
Complete quantum leap
► Ground-up rewrite of GPU
► Support for DirectX 10, and all it implies (more on this later)
► Geometry Shader
► Support for General GPU programming
► Shared Memory (NVIDIA only)
[Pipeline diagram, stages shown: Input Assembler, Programmable Vertex Shader, Programmable Geometry Shader, Raster Operations, Programmable Pixel Shader, Output Merger. Label: AGP]

Fixed-function pipeline
[Diagram, elements shown: 3D Application or Game; 3D API Commands; 3D API: OpenGL or Direct3D; CPU-GPU Boundary (AGP/PCIe); GPU Command & Data Stream; GPU Front End; Vertex Index Stream; Primitive Assembly; Assembled Primitives; Rasterization and Interpolation; Pixel Location Stream; Raster Operations; Pixel Updates; Frame Buffer; Pre-transformed Vertices; Programmable Vertex Processor; Transformed Vertices; Pre-transformed Fragments; Programmable Fragment Processor; Transformed Fragments]

A closer look at the fixed-function pipeline

Pipeline Input
Vertex: (x, y, z), (r, g, b, a), (Nx, Ny, Nz), (tx, ty, [tz]), (tx, ty), (tx, ty), Material properties*
Image: F(x,y) = (r,g,b,a)

ModelView Transformation
► Vertices mapped from object space to world space
► M = model transformation (scene)
► V = view transformation (camera)
Each matrix transform
is applied to each vertex in the input stream. Think of this as a kernel operator.
    [X’, Y’, Z’, W’]^T = M * V * [X, Y, Z, 1]^T

Lighting
Lighting information is combined with normals and other parameters at each vertex in order to create new colors.
    Color(v) = emissive + ambient + diffuse + specular
Each term on the right-hand side is a function of the vertex color, position, normal and material properties.

Clipping/Projection/Viewport (3D)
► More matrix transformations that operate on a vertex to transform it into the viewport space.
► Note that a vertex may be eliminated from the input stream (if it is clipped).
► The viewport is two-dimensional: however, the vertex z-value is retained for depth testing.
Clip test is the first example of a conditional in the pipeline. However, it is not a fully general conditional. Why?

Rasterizing+Interpolation
► All primitives are now converted to fragments.
► Data type change! Vertices to fragments
Texture coordinates are interpolated from texture coordinates of vertices. This gives us a linear interpolation operator for free. VERY USEFUL!
Fragment attributes: (r,g,b,a), (x,y,z,w), (tx,ty), …

Per-fragment operations
► The rasterizer produces a stream of fragments.
► Each fragment undergoes a series of tests with increasing complexity.
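As an aside on the interpolation step from the Rasterizing+Interpolation slide, here is a minimal sketch of how texture coordinates at a fragment could be derived from the texture coordinates of the three triangle vertices. It assumes plain screen-space barycentric weights with no perspective correction. The names `barycentric_weights` and `interpolate_texcoord` are hypothetical and chosen for illustration only. They do not come from the slides or from any graphics API.

```python
def barycentric_weights(p, a, b, c):
    """Return the weights (wa, wb, wc) of 2D point p within triangle a, b, c."""
    def edge(u, v, w):
        # Twice the signed area of the triangle (u, v, w).
        return (v[0] - u[0]) * (w[1] - u[1]) - (v[1] - u[1]) * (w[0] - u[0])

    area = edge(a, b, c)
    if area == 0:
        raise ValueError("degenerate triangle")
    # Each weight is the sub-triangle area opposite a vertex, over the full area.
    wa = edge(b, c, p) / area
    wb = edge(c, a, p) / area
    wc = edge(a, b, p) / area
    return wa, wb, wc


def interpolate_texcoord(p, verts, texcoords):
    """Linearly interpolate per-vertex (tx, ty) to the fragment position p.

    Assumption: screen-space (affine) interpolation, no perspective correction.
    """
    wa, wb, wc = barycentric_weights(p, verts[0], verts[1], verts[2])
    tx = wa * texcoords[0][0] + wb * texcoords[1][0] + wc * texcoords[2][0]
    ty = wa * texcoords[0][1] + wb * texcoords[1][1] + wc * texcoords[2][1]
    return (tx, ty)


# Example triangle with texture coordinates attached to its vertices.
verts = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0)]
texcoords = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
print(interpolate_texcoord((1.0, 1.0), verts, texcoords))
```

The same weights can interpolate any per-vertex attribute, not only texture coordinates, which is what makes the operator useful when the pipeline is repurposed for general computation.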
Test 1: Scissor
    If (fragment lies in fixed rectangle) let it pass else discard it
Scissor test is analogous to the clipping operation, in fragment space instead of vertex space.
Test 2: Alpha
    If (fragment.a >= <constant>) let it pass else discard it.
Alpha test is a slightly more general conditional. Why?

Per-fragment operations
► Stencil test: S(x, y) is the stencil buffer value for the fragment with coordinates (x, y)
► If f(S(x,y)), let pixel pass else kill it. Update S(x, y) conditionally depending on f(S(x,y)) and g(D(x,y)).
► Depth test: D(x, y) is the depth buffer value.
► If g(D(x,y)) let pixel pass else kill it. Update D(x,y) conditionally.

Per-fragment operations
► Stencil and depth tests are more general conditionals. Why?
► These are the only tests that can change the state of internal storage (stencil buffer, depth buffer).
► One of the update operations for the stencil buffer is a “count” operation. Remember this!
► Unfortunately, stencil and depth buffers have lower precision (8, 24 bits resp.)

Post-processing
► Blending: pixels are accumulated into final framebuffer storage
    new-val = old-val op pixel-value
If op is +, we can sum all the (say) red components of pixels that pass all tests.
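The scissor, alpha, stencil, and depth tests and the blending step described above can be read as one loop over the fragment stream. The sketch below is a CPU-side illustration under several assumptions: the stencil function f always passes and its update is the “count” (increment, clamped to 8 bits), the depth function g is "less than", and the blend op is +. The `Fragment` class and `run_fragment_tests` function are hypothetical names. Real hardware configures these stages through the 3D API rather than through code like this.

```python
from dataclasses import dataclass


@dataclass
class Fragment:
    x: int
    y: int
    z: float      # depth in [0, 1], smaller means closer
    rgba: tuple   # (r, g, b, a)


def run_fragment_tests(fragments, width, height, scissor, alpha_ref):
    """Apply scissor, alpha, stencil, and depth tests, then additive blending.

    scissor is (x0, y0, x1, y1), half-open on the upper bounds.
    Returns the (color, depth, stencil) buffers as nested lists.
    """
    color = [[(0.0, 0.0, 0.0, 0.0)] * width for _ in range(height)]
    depth = [[1.0] * width for _ in range(height)]
    stencil = [[0] * width for _ in range(height)]
    x0, y0, x1, y1 = scissor

    for f in fragments:
        # Test 1: scissor. Depends only on the fragment position.
        if not (x0 <= f.x < x1 and y0 <= f.y < y1):
            continue
        # Test 2: alpha. Depends on a value carried by the fragment.
        if not (f.rgba[3] >= alpha_ref):
            continue
        # Stencil (assumed: always pass). The update is the "count" operation,
        # clamped to the 8-bit range of the stencil buffer.
        stencil[f.y][f.x] = min(stencil[f.y][f.x] + 1, 255)
        # Depth (assumed: pass if closer). Reads and writes internal storage.
        if not (f.z < depth[f.y][f.x]):
            continue
        depth[f.y][f.x] = f.z
        # Blending with op = +: new-val = old-val + pixel-value.
        old = color[f.y][f.x]
        color[f.y][f.x] = tuple(o + n for o, n in zip(old, f.rgba))

    return color, depth, stencil


frags = [
    Fragment(1, 1, 0.5, (0.2, 0.0, 0.0, 1.0)),   # passes every test
    Fragment(1, 1, 0.8, (0.3, 0.0, 0.0, 1.0)),   # counted, then fails depth
    Fragment(3, 3, 0.1, (1.0, 1.0, 1.0, 1.0)),   # outside the scissor rectangle
]
color, depth, stencil = run_fragment_tests(frags, 4, 4, (0, 0, 2, 2), 0.5)
print(color[1][1], depth[1][1], stencil[1][1])
```

In this sketch the stencil count records every fragment that reaches the stencil stage, while the color buffer sums only the fragments that also pass the depth test. That shows how the two buffers can carry different accumulated results from the same stream.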
Problem: In generation <= IV, blending can only be done in 8-bit channels (the channels sent to the