Lecture 8: The Rasterization Pipeline (and its implementation on GPUs)

Computer Graphics, CMU 15-462/15-662, Spring 2018

What you know how to do (at this point in the course)

Position objects and the camera in the world. Determine the position of objects relative to the camera. Project objects onto the screen.

Sample triangle coverage. Compute triangle attribute values at covered sample points. Sample texture maps.

What else do you need to know to render a picture like this?

Surface representation: How to represent complex surfaces?

Occlusion: Determining which surface is visible to the camera at each sample point

Lighting/materials: Describing lights in the scene and how materials reflect light.

Course roadmap

Introduction

Drawing Things
Key concepts: drawing a triangle (by sampling), transforms and coordinate spaces, perspective projection and texture sampling, sampling (and anti-aliasing)
Today: putting it all together into the end-to-end rasterization pipeline

Geometry

Materials and Lighting

Occlusion

Occlusion: which triangle is visible at each covered sample point?

Opaque triangles / 50% transparent triangles

Review from last class

Assume we have a triangle defined by the screen-space 2D position and distance (“depth”) from the camera of each vertex.

Vertex positions and depths: [p0x, p0y]^T with d0; [p1x, p1y]^T with d1; [p2x, p2y]^T with d2.

How do we compute the depth of the triangle at a covered sample point (x, y)?

Interpolate it just like any other attribute that varies linearly over the surface of the triangle.
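As a concrete illustration of that interpolation, here is a minimal C sketch (names and signatures are assumptions, not code from the lecture) that evaluates depth at a covered sample (x, y) using screen-space barycentric weights built from edge functions; the perspective-correct 1/w details from the texturing lecture are ignored here.

// A helper edge function: twice the signed area of triangle (a, b, c).
float edge_fn(float ax, float ay, float bx, float by, float cx, float cy) {
    return (bx - ax) * (cy - ay) - (by - ay) * (cx - ax);
}

// Interpolate per-vertex depths d0, d1, d2 at sample (x, y) using
// screen-space barycentric weights (assumes a non-degenerate triangle).
float interpolate_depth(float x, float y,
                        float p0x, float p0y, float d0,
                        float p1x, float p1y, float d1,
                        float p2x, float p2y, float d2) {
    float area = edge_fn(p0x, p0y, p1x, p1y, p2x, p2y);
    float w0 = edge_fn(p1x, p1y, p2x, p2y, x, y) / area;   // weight of vertex 0
    float w1 = edge_fn(p2x, p2y, p0x, p0y, x, y) / area;   // weight of vertex 1
    float w2 = edge_fn(p0x, p0y, p1x, p1y, x, y) / area;   // weight of vertex 2
    return w0 * d0 + w1 * d1 + w2 * d2;
}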

Occlusion using the depth-buffer (Z-buffer)

For each coverage sample point, the depth buffer stores the depth of the closest triangle at this sample point that has been processed by the renderer so far.

Closest triangle at sample point (x,y) is triangle with minimum depth at (x,y)

Initial state of depth buffer before rendering any triangles (all samples store farthest distance)

Grayscale value of sample point indicates distance: black = small distance, white = large distance.

Depth buffer example

Example: rendering three opaque triangles

Occlusion using the depth-buffer (Z-buffer)

Processing yellow triangle (depth = 0.5). Grayscale value of sample point indicates distance: white = large distance, black = small distance. Red = sample passed depth test.

Color buffer contents Depth buffer contents

Occlusion using the depth-buffer (Z-buffer)

After processing yellow triangle. Grayscale value of sample point indicates distance: white = large distance, black = small distance. Red = sample passed depth test.

Color buffer contents Depth buffer contents

Occlusion using the depth-buffer (Z-buffer)

Processing blue triangle (depth = 0.75). Grayscale value of sample point indicates distance: white = large distance, black = small distance. Red = sample passed depth test.

Color buffer contents Depth buffer contents

Occlusion using the depth-buffer (Z-buffer)

After processing blue triangle. Grayscale value of sample point indicates distance: white = large distance, black = small distance. Red = sample passed depth test.

Color buffer contents Depth buffer contents

Occlusion using the depth-buffer (Z-buffer)

Processing red triangle (depth = 0.25). Grayscale value of sample point indicates distance: white = large distance, black = small distance. Red = sample passed depth test.

Color buffer contents Depth buffer contents

Occlusion using the depth-buffer (Z-buffer)

After processing red triangle. Grayscale value of sample point indicates distance: white = large distance, black = small distance. Red = sample passed depth test.

Color buffer contents Depth buffer contents

Occlusion using the depth buffer

bool pass_depth_test(d1, d2) {
    return d1 < d2;
}

depth_test(tri_d, tri_color, x, y) {
    if (pass_depth_test(tri_d, zbuffer[x][y])) {
        // Triangle is the closest object seen so far at this
        // sample point: update the depth and color buffers.
        zbuffer[x][y] = tri_d;     // update zbuffer
        color[x][y] = tri_color;   // update color buffer
    }
}

Does depth-buffer algorithm handle interpenetrating surfaces? Of course! Occlusion test is based on depth of triangles at a given sample point. The relative depth of triangles may be different at different sample points.

Green triangle in front of yellow triangle

Yellow triangle in front of green triangle

Does depth-buffer algorithm handle interpenetrating surfaces? Of course! Occlusion test is based on depth of triangles at a given sample point. The relative depth of triangles may be different at different sample points.

Does the depth buffer work with supersampling? Of course! Occlusion test is per sample, not per pixel!

This example: green triangle occludes yellow triangle

Color buffer contents

Color buffer contents (4 samples per pixel)

Final resampled result

Note anti-aliasing of edge due to filtering of green and yellow samples.
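For concreteness, here is a minimal sketch (buffer layout, sizes, and names are assumptions, not from the slides) of that resampling step: a supersampled color buffer holding 2x2 samples per pixel is box-filtered down to one final color per pixel.

typedef struct { float r, g, b; } color3;

#define W 1024   // assumed output width
#define H 768    // assumed output height

// Average the 4 samples of each pixel (box filter) into the final image.
void resolve_supersampled(const color3 samples[2 * H][2 * W], color3 image[H][W]) {
    for (int y = 0; y < H; y++) {
        for (int x = 0; x < W; x++) {
            color3 c = {0.0f, 0.0f, 0.0f};
            for (int sy = 0; sy < 2; sy++)
                for (int sx = 0; sx < 2; sx++) {
                    c.r += samples[2 * y + sy][2 * x + sx].r;
                    c.g += samples[2 * y + sy][2 * x + sx].g;
                    c.b += samples[2 * y + sy][2 * x + sx].b;
                }
            image[y][x].r = c.r * 0.25f;
            image[y][x].g = c.g * 0.25f;
            image[y][x].b = c.b * 0.25f;
        }
    }
}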

Summary: occlusion using a depth buffer
▪ Store one depth value per coverage sample (not per pixel!)
▪ Constant space per sample
- Implication: constant space for depth buffer
▪ Constant time occlusion test per covered sample
- Read-modify-write of depth buffer if “pass” depth test
- Just a read if “fail”
▪ Not specific to triangles: only requires that surface depth can be evaluated at a screen sample point

But what about semi-transparent surfaces?


Representing opacity as alpha

Alpha describes the opacity of an object
- Fully opaque surface: α = 1
- 50% transparent surface: α = 0.5
- Fully transparent surface: α = 0

Red triangle with decreasing opacity

α = 1, α = 0.75, α = 0.5, α = 0.25, α = 0

Alpha: an additional channel of the image (RGBA)

α of foreground object

Over operator:

Composite image B with opacity αB over image A with opacity αA

A over B != B over A: “over” is not commutative.

Koala over NYC

Fringing
Poor treatment of color/alpha can yield dark “fringing”:

foreground color foreground alpha background color

fringing / no fringing

No fringing

Fringing (…why does this happen?)

Over operator: non-premultiplied alpha
Composite image B with opacity αB over image A with opacity αA.

A first attempt:
A = [Ar Ag Ab]^T
B = [Br Bg Bb]^T

Composited color (appearance of semi-transparent B, plus what B lets through of semi-transparent A):
C = αB·B + (1 - αB)·αA·A
Composited alpha:
αC = αB + (1 - αB)·αA

A over B != B over A: “over” is not commutative.


Over operator: premultiplied alpha
Composite image B with opacity αB over image A with opacity αA.

Non-premultiplied:
A = [Ar Ag Ab]^T
B = [Br Bg Bb]^T
C = αB·B + (1 - αB)·αA·A   (two multiplies, one add, referring to vector ops on colors)
αC = αB + (1 - αB)·αA

Premultiplied:
A' = [αA·Ar  αA·Ag  αA·Ab  αA]^T
B' = [αB·Br  αB·Bg  αB·Bb  αB]^T
C' = B' + (1 - αB)·A'   (one multiply, one add)

Notice that premultiplied alpha composites alpha just like it composites RGB; non-premultiplied alpha composites alpha differently than RGB.
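A small code sketch may make the contrast concrete. The struct and function names below are assumptions (not the lecture's API); the two functions implement the non-premultiplied and premultiplied “over” operators exactly as written above.

typedef struct { float r, g, b, a; } rgba;   // channels in [0, 1]

// Non-premultiplied "over": B over A.
// Color and alpha are combined with different formulas.
rgba over_non_premultiplied(rgba B, rgba A) {
    rgba C;
    C.r = B.a * B.r + (1.0f - B.a) * A.a * A.r;
    C.g = B.a * B.g + (1.0f - B.a) * A.a * A.g;
    C.b = B.a * B.b + (1.0f - B.a) * A.a * A.b;
    C.a = B.a + (1.0f - B.a) * A.a;
    return C;
}

// Premultiplied "over": B' over A', where RGB has already been multiplied
// by alpha. All four channels use the same formula.
rgba over_premultiplied(rgba Bp, rgba Ap) {
    rgba C;
    C.r = Bp.r + (1.0f - Bp.a) * Ap.r;
    C.g = Bp.g + (1.0f - Bp.a) * Ap.g;
    C.b = Bp.b + (1.0f - Bp.a) * Ap.b;
    C.a = Bp.a + (1.0f - Bp.a) * Ap.a;
    return C;
}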

A problem with non-premultiplied alpha

▪ Suppose we upsample an image w/ an alpha mask, then composite it onto a background
▪ How should we compute the interpolated color/alpha values?
▪ If we interpolate color and alpha separately, then blend using the non-premultiplied “over” operator, here’s what happens:

original color / original alpha

upsampled color / upsampled alpha

Notice black “fringe” that occurs because we’re blending, e.g., 50% blue pixels using 50% alpha, rather than, say, 100% blue pixels with 50% alpha.

composited onto yellow background

Eliminating fringe w/ premultiplied “over”

▪ If we instead use the premultiplied “over” operation, we get the correct alpha:

upsampled color + (1 - alpha) * background = composite image w/ no fringe

Eliminating fringe w/ premultiplied “over”

▪ If we instead use the premultiplied “over” operation, we get the correct alpha:

composite image WITH fringe

Similar problem with non-premultiplied alpha
Consider pre-filtering (downsampling) a texture with an alpha channel. Desired filtered result:


input color / input α / filtered color / filtered α / filtered result
(Downsampling non-premultiplied alpha composited over a white image results in 50% opaque brown.)

Result of filtering the premultiplied image: 0.25 * ((0, 1, 0, 1) + (0, 1, 0, 1) + (0, 0, 0, 0) + (0, 0, 0, 0)) = (0, 0.5, 0, 0.5)
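As a sketch of the same computation in code (types and names assumed, not from the slides): a 2x2 box downsample applied to premultiplied RGBA texels reproduces the arithmetic above.

typedef struct { float r, g, b, a; } rgba;   // premultiplied RGBA texel

// 2x2 box downsample in premultiplied-alpha space: all four channels,
// including alpha, are filtered the same way.
rgba downsample_2x2_premultiplied(rgba t00, rgba t01, rgba t10, rgba t11) {
    rgba out;
    out.r = 0.25f * (t00.r + t01.r + t10.r + t11.r);
    out.g = 0.25f * (t00.g + t01.g + t10.g + t11.g);
    out.b = 0.25f * (t00.b + t01.b + t10.b + t11.b);
    out.a = 0.25f * (t00.a + t01.a + t10.a + t11.a);
    return out;
}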


More problems: applying “over” repeatedly
Composite image C with opacity αC over image B with opacity αB over image A with opacity αA: over(C, over(B, A))

Non-premultiplied alpha is not closed under composition. Consider the result of compositing 50% red over 50% red:
C = [0.75  0  0]^T
αC = 0.75
Wait… this result is the premultiplied color! “Over” for non-premultiplied alpha takes non-premultiplied colors to premultiplied colors (the “over” operation is not closed); recovering the non-premultiplied color would require a divide: C = (αB·B + (1 - αB)·αA·A) / αC. So we cannot directly compose “over” operations on non-premultiplied values: over(C, over(B, A)).

Q: What would be the correct UN-premultiplied RGBA for 50% red on top of 50% red?

Summary: advantages of premultiplied alpha

▪ Simple: compositing operation treats all channels (RGB and A) the same
▪ More efficient than non-premultiplied representation: “over” requires fewer math ops
▪ Closed under composition
▪ Better representation for filtering (upsampling/downsampling) textures with an alpha channel

Strategy for drawing semi-transparent primitives
Assuming all primitives are semi-transparent, and RGBA values are encoded with premultiplied alpha, here’s one strategy for creating a correctly rasterized image:

over(c1, c2) {
    return c1.rgba + (1 - c1.a) * c2.rgba;
}

update_color_buffer(x, y, sample_color, sample_depth) {
    if (pass_depth_test(sample_depth, zbuffer[x][y])) {
        // (how) should we update depth buffer here??
        color[x][y] = over(sample_color, color[x][y]);
    }
}

Q: What is the assumption made by this implementation? Triangles must be rendered in back to front order!

Putting it all together
Now what if we have a mixture of opaque and transparent triangles?

Step 1: render opaque primitives (in any order) using depth-buffered occlusion. If a triangle passes the depth test, it overwrites the value in the color buffer at the sample.

Step 2: disable depth buffer updates, then render semi-transparent surfaces in back-to-front order. If a triangle passes the depth test, it is composited OVER the contents of the color buffer at the sample.
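A minimal sketch of that two-pass strategy follows; the Triangle type and every helper routine are hypothetical placeholders (not the lecture's or any particular API), and the point is only the ordering and the depth-write settings.

typedef struct Triangle Triangle;            // assumed triangle record

// Assumed helpers: standard clear / sort / rasterize routines.
void clear_color_buffer(void);
void clear_depth_buffer(void);               // all samples = farthest distance
void set_depth_write(int enabled);
void sort_back_to_front(Triangle* tris, int n);
void rasterize_opaque(const Triangle* t);    // depth test + depth write + color overwrite
void rasterize_blended(const Triangle* t);   // depth test only + "over" into color buffer

void render_frame(Triangle* opaque, int n_opaque,
                  Triangle* transparent, int n_transparent) {
    clear_color_buffer();
    clear_depth_buffer();

    // Pass 1: opaque triangles, any order, depth test + depth write enabled.
    set_depth_write(1);
    for (int i = 0; i < n_opaque; i++)
        rasterize_opaque(&opaque[i]);

    // Pass 2: semi-transparent triangles, depth writes disabled,
    // sorted and composited back to front with "over".
    set_depth_write(0);
    sort_back_to_front(transparent, n_transparent);
    for (int i = 0; i < n_transparent; i++)
        rasterize_blended(&transparent[i]);
}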

End-to-end rasterization pipeline (“real-time graphics pipeline”)

Goal: turn these inputs into an image!

Inputs:
list_of_positions = {
    v0x, v0y, v0z,
    v1x, v1y, v1z,
    v2x, v2y, v2z,
    v3x, v3y, v3z,
    v4x, v4y, v4z,
    v5x, v5y, v5z };

list_of_texcoords = {
    v0u, v0v,
    v1u, v1v,
    v2u, v2v,
    v3u, v3v,
    v4u, v4v,
    v5u, v5v };

Texture map

Object-to-camera-space transform: T

Perspective projection transform: P

Size of output image (W, H)

At this point we should have all the tools we need, but let’s review…

Step 1: Transform triangle vertices into camera space
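A minimal sketch of this step, assuming positions are carried as homogeneous 4-vectors and the object-to-camera transform T is a 4x4 matrix (types and names are illustrative, not from the slides):

typedef struct { float m[4][4]; } mat4;
typedef struct { float x, y, z, w; } vec4;

// Apply the 4x4 object-to-camera transform T to a homogeneous position.
vec4 transform(mat4 T, vec4 v) {
    vec4 r;
    r.x = T.m[0][0]*v.x + T.m[0][1]*v.y + T.m[0][2]*v.z + T.m[0][3]*v.w;
    r.y = T.m[1][0]*v.x + T.m[1][1]*v.y + T.m[1][2]*v.z + T.m[1][3]*v.w;
    r.z = T.m[2][0]*v.x + T.m[2][1]*v.y + T.m[2][2]*v.z + T.m[2][3]*v.w;
    r.w = T.m[3][0]*v.x + T.m[3][1]*v.y + T.m[3][2]*v.z + T.m[3][3]*v.w;
    return r;
}

// Camera-space position of an object-space vertex (px, py, pz):
//   vec4 p_cam = transform(T, (vec4){ px, py, pz, 1.0f });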


Step 2: Apply the perspective projection transform to map triangle vertices into normalized coordinate space

Projection to 2D: x_2D = [x_x / x_z, x_y / x_z]^T
The view frustum spans ±tan(θ/2) vertically and ±aspect · tan(θ/2) horizontally; normalized coordinate space is the cube from (-1, -1, -1) to (1, 1, 1).

Camera-space positions (3D) → normalized-space positions
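A minimal sketch of the x/y part of this step, assuming the pinhole model and frustum parameters (vertical half-angle θ, aspect ratio) from earlier lectures; names are illustrative. In a real pipeline the projection matrix produces homogeneous coordinates and the divide happens later, and the mapping of depth into [-1, 1] is omitted here; this sketch folds the divide in for simplicity.

#include <math.h>

typedef struct { float x, y, z; } vec3;
typedef struct { float x, y; } vec2;

// Project a camera-space point to normalized [-1, 1] x/y coordinates.
vec2 project_to_normalized(vec3 p, float theta, float aspect) {
    // perspective divide: x_2D = [ p.x / p.z , p.y / p.z ]^T
    float x2d = p.x / p.z;
    float y2d = p.y / p.z;
    vec2 n;
    // scale so the view frustum edges map to +/-1 in x and y
    n.x = x2d / (aspect * tanf(theta * 0.5f));
    n.y = y2d / tanf(theta * 0.5f);
    return n;
}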

Step 3: Clipping
▪ Discard triangles that lie completely outside the unit cube (culling)
- They are off screen; don’t bother processing them further (see the culling sketch below)
▪ Clip triangles that extend beyond the unit cube to the cube
- (possibly generating new triangles)

Triangles before clipping / Triangles after clipping
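A minimal sketch of the trivial-reject (culling) test mentioned above, assuming the slides' convention that the view volume after projection is the cube [-1, 1]^3 (names illustrative):

typedef struct { float x, y, z; } vec3;

// Returns 1 if the triangle (a, b, c) can be trivially discarded because
// all three vertices lie outside the same face of the [-1, 1]^3 cube.
int trivially_culled(vec3 a, vec3 b, vec3 c) {
    if (a.x >  1.0f && b.x >  1.0f && c.x >  1.0f) return 1;
    if (a.x < -1.0f && b.x < -1.0f && c.x < -1.0f) return 1;
    if (a.y >  1.0f && b.y >  1.0f && c.y >  1.0f) return 1;
    if (a.y < -1.0f && b.y < -1.0f && c.y < -1.0f) return 1;
    if (a.z >  1.0f && b.z >  1.0f && c.z >  1.0f) return 1;
    if (a.z < -1.0f && b.z < -1.0f && c.z < -1.0f) return 1;
    return 0;   // otherwise: clip against the cube (possibly creating new triangles)
}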

Step 4: Transform to screen coordinates

Perform the homogeneous divide, then transform vertex xy positions from normalized coordinates into screen coordinates (based on screen W, H).

(w, h)

(0, 0)
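A minimal sketch of this viewport mapping for a W x H output image, assuming normalized coordinates in [-1, 1] and the screen origin at (0, 0) as in the figure (names illustrative):

typedef struct { float x, y; } vec2;

// Map normalized [-1, 1] coordinates to pixel coordinates of a W x H image,
// with (0, 0) at one corner and (W, H) at the opposite corner.
vec2 normalized_to_screen(vec2 n, float W, float H) {
    vec2 s;
    s.x = (n.x + 1.0f) * 0.5f * W;
    s.y = (n.y + 1.0f) * 0.5f * H;
    return s;
}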

Step 5: Set up triangle (triangle preprocessing)
Before rasterizing the triangle, we can compute a bunch of data that will be used by all fragments, e.g.:
• triangle edge equations: E01(x, y), E12(x, y), E20(x, y) (see the sketch below)
• triangle attribute equations: U(x, y), V(x, y), 1/w(x, y), Z(x, y)
• etc.
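The sketch below shows what setting up the three edge equations might look like; the E(x, y) = Ax + By + C form matches the slide, but the helper names and sign convention are assumptions.

typedef struct { float A, B, C; } edge_eqn;

// Build the edge equation for the directed edge from (x0, y0) to (x1, y1).
edge_eqn make_edge(float x0, float y0, float x1, float y1) {
    edge_eqn e;
    e.A = y0 - y1;
    e.B = x1 - x0;
    e.C = x0 * y1 - x1 * y0;
    return e;
}

// Evaluate the edge equation at a sample point.
float eval_edge(edge_eqn e, float x, float y) {
    return e.A * x + e.B * y + e.C;   // > 0 on one side of the edge, < 0 on the other
}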

Step 6: Sample coverage. Evaluate attributes z, u, v at all covered samples.
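Building on the Step 5 sketch above, here is an assumed (not lecture-provided) coverage loop that tests each sample in the triangle's screen-space bounding box against the three edge equations; it assumes counter-clockwise winding so all edge functions are non-negative inside, and emit_fragment is a hypothetical hook into the next pipeline stage.

// (Reuses edge_eqn and eval_edge from the Step 5 sketch.)
void emit_fragment(int x, int y);   // hypothetical downstream hook

void sample_coverage(edge_eqn e01, edge_eqn e12, edge_eqn e20,
                     int xmin, int xmax, int ymin, int ymax) {
    for (int y = ymin; y <= ymax; y++) {
        for (int x = xmin; x <= xmax; x++) {
            float sx = x + 0.5f, sy = y + 0.5f;   // sample at the pixel center
            // Covered if the sample is inside (or on) all three edges.
            if (eval_edge(e01, sx, sy) >= 0.0f &&
                eval_edge(e12, sx, sy) >= 0.0f &&
                eval_edge(e20, sx, sy) >= 0.0f) {
                emit_fragment(x, y);
            }
        }
    }
}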

Step 7: Compute triangle color at sample point, e.g., sample texture map*

u(x, y), v(x, y)

* So far, we’ve only described computing a triangle’s color at a point by interpolating per-vertex colors, or by sampling a texture map. Later in the course, we’ll discuss more advanced algorithms for computing its color based on material properties and scene lighting conditions.

Step 8: Perform depth test (if enabled). Also update depth value at covered samples (if necessary).

PASS PASS PASS FAIL PASS PASS FAIL PASS PASS PASS

FAIL FAIL PASS PASS PASS FAIL FAIL PASS PASS PASS

Step 9: Update color buffer (if depth test passed)

OpenGL/Direct3D graphics pipeline*
Structures rendering computation as a series of operations on vertices, primitives, fragments, and screen samples.

Input: vertices in 3D space
Vertex Processing (operations on vertices) → vertex stream: vertices positioned in normalized coordinate space
Primitive Processing (operations on primitives: triangles, lines, etc.) → primitive stream: triangles positioned on screen
Fragment Generation (Rasterization) → fragment stream: fragments (one fragment per covered sample)
Fragment Processing (operations on fragments) → shaded fragment stream: shaded fragments
Screen sample operations (depth and color; operations on screen samples) → output: image (pixels)

* Several stages of the modern OpenGL pipeline are omitted

OpenGL/Direct3D graphics pipeline*

Pipeline inputs:
- Input vertex data
- Parameters needed to compute position of vertices in normalized coordinates (e.g., transform matrices, consumed by Vertex Processing)
- Parameters needed to compute color of fragments (e.g., textures, consumed by Fragment Processing)
- Shader programs that define the behavior of the vertex and fragment stages

* Several stages of the modern OpenGL pipeline are omitted

Shader programs
Define behavior of vertex processing and fragment processing stages. Describe operation on a single vertex (or single fragment).

Example GLSL fragment shader program
The shader function executes once per fragment and outputs the color of the surface at the sample point corresponding to the fragment. (This shader performs a texture lookup to obtain the surface’s material color at this point, then performs a simple lighting computation.)

uniform sampler2D myTexture;  // program parameter
uniform vec3 lightDir;        // program parameter
varying vec2 uv;              // per-fragment attribute (interpolated by rasterizer)
varying vec3 norm;            // per-fragment attribute (interpolated by rasterizer)

void diffuseShader()
{
    vec3 kd;
    kd = texture2D(myTexture, uv).rgb;            // sample surface albedo (reflectance color) from texture
    kd *= clamp(dot(-lightDir, norm), 0.0, 1.0);  // modulate albedo by incident irradiance (incoming light)
    gl_FragColor = vec4(kd, 1.0);                 // shader output: surface color
}

Goal: render very high complexity 3D scenes
- 100’s of thousands to millions of triangles in a scene
- Complex vertex and fragment shader computations
- High resolution screen outputs (2-4 Mpixel + supersampling)
- 30-60 fps

Unreal Engine Kite Demo (Epic Games 2015)

Graphics pipeline implementation: GPUs
Specialized processors for executing graphics pipeline computations

Discrete GPU card (NVIDIA GeForce Titan X)

Integrated GPU: part of modern Intel CPU die

GPU: heterogeneous, multi-core processor
Modern GPUs offer ~2-4 TFLOPs of performance for executing vertex and fragment shader programs, plus T-OPs of fixed-function compute capability.

(Diagram: programmable SIMD execution units with caches for running shader programs, alongside fixed-function hardware for texture filtering, tessellation, clip/cull, rasterization, and Z-buffer/blend, all fed by a scheduler/work distributor and connected to GPU memory.)

Summary

▪ Occlusion resolved independently at each screen sample using the depth buffer
▪ Alpha compositing for semi-transparent surfaces
- Premultiplied alpha simplifies repeated composition
- The “over” compositing operation is not commutative: requires triangles to be processed in back-to-front (or front-to-back) order

▪ Graphics pipeline:
- Structures rendering computation as a sequence of operations performed on vertices, primitives (e.g., triangles), fragments, and screen samples
- Behavior of parts of the pipeline is application-defined using shader programs
- Pipeline operations implemented by highly optimized parallel processors and fixed-function hardware (GPUs)
