Programmability Features of Graphics Hardware
Michael Doggett
ATI Research
[email protected]

Outline
  Graphics Hardware
  Transform Stage
    Vertex Engine
    OpenGL ARB vertex program
  Fragment Stage
    Pixel Pipeline
    OpenGL ARB fragment program
  Examples
    Mandelbrot
    FFT
    Displacement Mapping
  Programming Options
    OpenGL Shading Language overview
  Computation on GPUs

Graphics Hardware
  Hardware pipeline
    Surface -> Transform -> Rasterization -> Fragment -> FrameBuffer
  Based on RADEON 9800 380MHz
  SIMD stages
  Highly parallel
  4 component vector registers and operations
  High computation/bandwidth ratio
    Arithmetic intensity
  [Diagram: parallel Transform (T) units feeding parallel Fragment (F) units, backed by Memory]

Graphics Hardware
  Input: 3D Scene
    Typically triangles made up of vertices
    1D buffers of vertex data
  Programs
    Vertex and Fragment programs
  Textures
    Random memory access
  Output: 2D Image
    Color, depth and stencil
  [Diagram: Vertices enter at Surface, Programs feed Transform and Fragment, Textures feed Fragment, Image leaves FrameBuffer]

Graphics Hardware
  Focus on programmable stages
    Transform
    Fragment

Graphics Hardware Evolution
  Year           2000              2001                 2002
  ATI RADEON     7500 (R100)       8500 (R200)          9700 (R300)
  DirectX        7                 8                    9
  Stages:
  Surface        CPU               CPU/GPU              CPU/GPU
  Transform      State             VS 1.1               VS 2.0
                                   128 instrs           256 instrs
                                                        Control Flow
  Rasterization  State             State                State
  Fragment       State             PS 1.4               PS 2.0
                                   12 tex/16 alu        32 tex/64 alu
                                   s3.12 Fixed point    s16e7 Floating Point
  FrameBuffer    State             State                State

Surface Generation Stage
  Newest stage
  DX8 NPatches
  State controlled
  Geometric calculations based on complex surfaces

Transform Stage
  Includes:
    ModelView Transformation
    Vertex Lighting
    Perspective Transformation
    Tweening/Skinning
  Per-vertex operations
  Originally microcoded on DSPs or CPU
  Controlled by fixed function state
  Programmable Vertex Engine
    Lindholm et al., SIGGRAPH 2001

Vertex Engine
  4 parallel vertex engines
  Input
    Vertex stream
    Constants
  Output
    Position, Color, Tex Coords
  Program (Shader) has up to 256 instructions
  [Diagram: Vertex Data (16) -> Vertex Engine -> Output]

Vertex Engine
  Registers
    Four component floating point vectors
    12 read-write temp registers
    Output registers
      position, color, fogcoord, pointsize, texcoord
  Vertex shader outputs are pixel shader inputs
  [Diagram: Vertex Data, Address (1), Temp (12) and Constant (256) registers around the Vertex Engine, feeding Output]

Vertex Program Instructions
  Basic arithmetic operators
    ADD, MAD, MUL, SUB
  Comparison
    MAX, MIN, SGE, SLT
  Dot and cross product
    DP3, DP4, DPH, XPD
  Exponential functions
    EX2, EXP, LG2, LOG
  Other
    ABS, ARL, DST, FLR, FRC, LIT, MOV, POW, RCP, RSQ, SWZ

Register Modifiers
  Source swizzle selects which components to use
    iPosition.[xyzw][xyzw][xyzw][xyzw]
    e.g. iPosition.yzxw
  Destination mask to select individual components
    oColor.{x}{y}{z}{w}
  Source negation
    -iNormal

Simple Vertex Program
    !!ARBvp1.0
    ATTRIB iPos = vertex.position;
    PARAM mvp[4] = { state.matrix.mvp };
    PARAM ambientCol = state.lightprod[0].ambient;
    OUTPUT oPos = result.position;
    OUTPUT oColor = result.color;
    # Transform the vertex to clip coordinates.
    DP4 oPos.x, mvp[0], iPos;
    DP4 oPos.y, mvp[1], iPos;
    DP4 oPos.z, mvp[2], iPos;
    DP4 oPos.w, mvp[3], iPos;
    # Write out a color.
    MOV oColor, ambientCol;
    END

RADEON 9800 Vertex Shader
  DirectX 9.0 Vertex Shader 2.0
  Same arithmetic instructions
  Constant based control flow capabilities
    Loops, branches, subroutines
    CALL, LOOP, ENDLOOP, JUMP, JNZ, LABEL, REPEAT, ENDREPEAT, RETURN
  16 Integer constants
  16 Boolean constants
  Loop counter

Rasterization Stage
  Includes:
    Triangle Setup
    Viewport Clipping
    Viewport Transform
    Rasterization
  Not programmable, some control through state
  High precision vertex parameter interpolators
    Position, normal, color, tex coords

Fragment Stage
  Includes:
    Texturing
      arbitrary memory fetch (Gather)
      fixed point filtering
        Point, Linear, Bi-linear, Tri-linear, Anisotropic
    Fragment (Per-Pixel) Lighting
  RADEON 9700 introduced floating point
  32 Texture and 64 ALU instructions
  Co-issue vector and scalar instruction

Pixel Pipeline
  8 parallel floating point pixel pipelines
  Input
    Color, Texcoords
  Output
    Color, Depth
  Output surfaces
    16, 32 bit float
    16 bit fixed
  [Diagram: Texture Coords (8) and Color -> Pixel Pipeline (PP) -> Color]

Pixel Pipeline
  Registers
    Four component 24bit floating point performance and precision
    12 read-write temp registers
  Texture
    16 Texture images
    Texture instructions
      KIL, TEX, TXB, TXP
  [Diagram: Texture Coords (8), Color, Temp (12), Constant (32) and Samplers (16) around the Pixel Pipeline (PP), feeding Color]

Simple Fragment Program
    !!ARBfp1.0
    TEMP temp;
    ATTRIB tex0 = fragment.texcoord[0];   # input register
    ATTRIB col0 = fragment.color;
    PARAM half = { 0.5, 0.5, 0.5, 0.5 };  # constant
    OUTPUT out = result.color;
    # Fetch texture
    TEX temp, tex0, texture[0], 2D;
    # modulate and write out color
    MUL out, col0, temp;
    END

Computing the Mandelbrot set
  Test each point on the complex number plane
    X dimension is the real component
    Y dimension is the imaginary component
  Z' = Z^2 + C
    C = starting position on complex plane
  Iterate until |Z| > 2

Mandelbrot main loop
    MUL pos.xy, curr, curr;            # real component
    ADD pos.x, pos.x, -pos.y;          # x^2 - y^2 + start.x
    ADD pos.x, pos.x, start.x;
    MUL pos.y, curr.x, curr.y;         # imaginary component
    MAD pos.y, pos.y, two.x, start.y;
    DP3 magnitude, pos, pos;           # calculate magnitude
    SUB magnitude, magnitude, four;    # compare magnitude to 4
    CMP escape.x, magnitude, 0, 1;
    ADD pos.z, pos.z, escape.x;
    MOV curr, pos;                     # ready next iteration

Mandelbrot multipass demo
  Main loop is 11 instructions
  Need multiple iterations to get detail
  Multipass
    Render to texture
    Pass 1: Render output to framebuffer
      Set framebuffer data as texture
    Pass 2 to N: Read in previous result as texture
    Pass N+1: Draw texture on screen
  [Figure: Mandelbrot demo]

Render to texture
  Render data
  Set data as texture input to Fragment stage
  [Diagram: Surface -> Transform -> Rasterization -> Fragment -> Output 1 -> FrameBuffer, with the FrameBuffer output fed back to the Fragment stage as a texture]
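
The Mandelbrot multipass scheme described earlier (one iteration of the main loop per pass, with each pass reading the previous pass's output as a texture) can be sketched on the CPU. This is only an illustrative Python sketch, not the demo's actual code: the function names `mandelbrot_pass` and `render_mandelbrot` are hypothetical, the "texture" is modelled as a plain list of (x, y, count) tuples, the starting value of Z is assumed to be 0, and the magnitude test is assumed to use only the x and y components.

```python
def mandelbrot_pass(prev_texture, starts):
    """One pass: run one main-loop iteration per pixel (hypothetical helper).

    prev_texture holds (x, y, count) per pixel, standing in for the
    previous pass's render target bound as a texture.
    """
    next_texture = []
    for (x, y, count), (cx, cy) in zip(prev_texture, starts):
        new_x = x * x - y * y + cx          # real component: x^2 - y^2 + start.x
        new_y = 2.0 * x * y + cy            # imaginary component: 2xy + start.y
        magnitude = new_x * new_x + new_y * new_y   # squared magnitude (xy only, an assumption)
        # Like SUB + CMP: 0 while magnitude - 4 is negative, otherwise 1.
        escape = 0.0 if (magnitude - 4.0) < 0.0 else 1.0
        next_texture.append((new_x, new_y, count + escape))
    return next_texture


def render_mandelbrot(starts, passes):
    """Ping-pong between 'textures' for a fixed number of passes."""
    texture = [(0.0, 0.0, 0.0) for _ in starts]   # assumed initial state: Z = 0
    for _ in range(passes):
        texture = mandelbrot_pass(texture, starts)  # output becomes next input
    return [count for (_, _, count) in texture]


if __name__ == "__main__":
    # C = 0 never escapes; C = 2 + 2i is outside radius 2 on every pass.
    print(render_mandelbrot([(0.0, 0.0), (2.0, 2.0)], passes=8))  # [0.0, 8.0]
```

As in the slide's `ADD pos.z, pos.z, escape.x`, the count accumulates the number of passes in which the point lies outside radius 2, so points that escape early end up with larger counts; the final pass on the GPU would map that count to a color.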