Fourth Eurographics Workshop on Parallel Graphics and Visualization (2002) D. Bartz, X. Pueyo, E. Reinhard (Editors)

Distributed rendering of interactive soft shadows

M. Isard†, M. Shand and A. Heirich

Compaq Computer Corporation

Abstract

Recently several distributed rendering systems have been developed which exploit a cluster of commodity computers by connecting host graphics cards over a fast network to form a compositing pipeline. This paper introduces a new algorithm which takes advantage of the programmable compositing operators in these systems to improve the performance of rendering multiple shadow-maps, for example to produce approximate soft shadows. With an nVidia GeForce4 Ti graphics card the new algorithm reduces the number of required render nodes by nearly a factor of four compared with a naive approach. We show results that yield interactive-speed rendering of 32 shadows on a 9-node Sepia2a distributed rendering cluster.

Categories and Subject Descriptors (according to ACM CCS): I.3.7 [Computer Graphics]: Three-Dimensional Graphics and Realism

† Now at Microsoft Research

1. Introduction

There is an extensive literature on rendering shadows at interactive rates. The two most popular methods are shadow volumes [1] and shadow maps [12], both of which can be implemented partially or completely in the current generation of programmable commodity graphics cards, such as the nVidia GeForce4 and the ATI Radeon 8500. Shadow volumes can be used to cast accurate hard shadows without aliasing artifacts, but there is some extra cost in preprocessing the geometry [3], and if the scene is made up of many small objects, for example the leaves of a plant, performance can become limited by stencil-buffer fill rate. It is also difficult to modify shadow volume techniques to deal with effects such as hair and fog [6].

Shadow maps remove some of the limitations of shadow volumes at the cost of introducing aliasing artifacts due to mismatched projection resolutions in the shadow map and the eye view [12, 6]. Shadow maps become inefficient when rendering multiple shadows, however, since 2L rendering passes are needed to render a scene lit by L point sources (L passes to render the shadow maps, and L passes feeding into an accumulation buffer to composite the illumination information from each light in turn). Hardware-accelerated shadow mapping also consumes at least one texture unit, which is a scarce resource in current graphics cards.

Several distributed rendering systems have recently been developed which exploit a cluster of commodity computers, each with a host graphics card and linked by a fast network [5, 11, 8]. Typically these systems can be programmed with a range of simple compositing operators which combine each locally rendered pixel with a remote pixel from one or more nodes, and output the transformed result to subsequent nodes in the rendering pipeline. The available compositing operators usually include, for example, depth compare, alpha blending and antialiasing. A given application programs the nodes under its control with an appropriate set of compositing operators to create a rendering pipeline that meets that application's overall rendering needs.

This paper presents an algorithm which uses a custom compositing operator to render a shadow-mapped scene on a distributed rendering cluster. Illumination by L point sources can be rendered by (L/K) + 1 nodes, where K is the number of texture units on each graphics card. For walkthrough applications each node requires a single rendering pass, while for scenes with moving lights or geometry K + 1 passes are needed per node. In addition all K texture units are available for rendering material properties, allowing a full range of multi-texture material effects.
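As a concrete example of the node count: with K = 4 texture units per card (the GeForce4 Ti of section 4), the L = 32 point light sources used for our results require L/K + 1 = 32/4 + 1 = 9 nodes, which is exactly the 9-node Sepia2a cluster mentioned in the abstract.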

2. Shadow mapping

Shadow maps were proposed by Lance Williams in 1978 [12] and have become increasingly popular for interactive shadows as hardware support for rendering them has become prevalent [10, 4]. We will briefly describe how shadow maps are implemented when depth textures are available, for example using the ARB_shadow OpenGL extension. Shadow maps can also be implemented somewhat less efficiently without depth textures, for example using the ATI Radeon 8500, and there is a brief discussion of the tradeoffs involved in section 4.2.

Following standard notation [10] we introduce three homogeneous coordinate systems: clip coordinates, world coordinates and light coordinates. The scene's geometry is expressed in world coordinates x = (x, y, z, w)^T. The geometry can be projected into the eye's viewpoint using the projective transformation matrix F_c M_c to give clip coordinates x_c = (x_c, y_c, z_c, w_c)^T = F_c M_c x. Similarly the projective transformation F_l M_l converts world coordinates to light coordinates x_l = (x_l, y_l, z_l, w_l)^T = F_l M_l x. Following OpenGL coordinate-transformation conventions, M_c and M_l are typically rigid-body transformations effecting scale and rotation, and F_c and F_l are projections to a clip frustum.

A scene illuminated by a single point light source is rendered in two passes. First the scene is rendered from the viewpoint of the light source and the resulting image (the shadow map) is stored in a depth texture, where T(u, v) is the depth value stored at coordinate (u, v). The shadow map is used to store the scene depth projected into light-view coordinates, so

    T(x_l / w_l, y_l / w_l) = z_l / w_l.

Next the scene is rendered again from the eye's viewpoint with the shadow map bound to one of the graphics card's texture units. During polygon rasterisation the texture coordinates x_t = (x_t, y_t, z_t, w_t) at pixel x_c are generated using the texture matrix F_l M_l (F_c M_c)^{-1} and are thus transformed to light coordinate space. At each pixel the texture hardware is used to read a value

    z_m = T(x_t / w_t, y_t / w_t)

from the shadow map and a depth comparison is done; if z_t / w_t < z_m + ε then the pixel is considered to be illuminated, otherwise the pixel is in shadow. The depth bias ε is included to reduce self-shadowing artifacts and should be chosen based on the geometry of the model being used [9].

Percentage closer filtering [9] can be achieved by enabling linear filtering on the texture map using e.g. the ARB_depth_texture OpenGL extension, resulting in a per-pixel illumination value s ∈ [0, 1] which varies from s = 0 for points completely in shadow to s = 1 for points which are completely illuminated. Programmable fragment stages available on high-end commodity graphics cards can then be used to set the final pixel value p^f = s·p, where p is the fully-illuminated pixel colour.
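To make the depth comparison concrete, here is a minimal CPU-side sketch of the test just described; it is our illustration, not the paper's implementation, which is performed by the texture hardware. A world-space point is projected by F_l M_l into light coordinates, remapped to [0, 1] texture space (the effect of the bias matrix S given in section 4.1), and compared against the stored depth T(u, v). The helper types and the nearest-neighbour lookup are assumptions made for illustration only.

```cpp
// Reference sketch of the shadow-map test of section 2 (CPU-side, for illustration).
#include <vector>
#include <algorithm>

struct Vec4 { float x, y, z, w; };
struct Mat4 { float m[4][4]; };                 // row-major 4x4 matrix

static Vec4 mul(const Mat4& A, const Vec4& v) {
    float in[4] = { v.x, v.y, v.z, v.w }, out[4];
    for (int r = 0; r < 4; ++r) {
        out[r] = 0.0f;
        for (int c = 0; c < 4; ++c) out[r] += A.m[r][c] * in[c];
    }
    return { out[0], out[1], out[2], out[3] };
}

// Shadow map T: depths in [0,1], square resolution 'size', nearest-neighbour lookup.
struct ShadowMap {
    int size;
    std::vector<float> depth;
    float T(float u, float v) const {
        int i = std::clamp(int(u * size), 0, size - 1);
        int j = std::clamp(int(v * size), 0, size - 1);
        return depth[j * size + i];
    }
};

// Returns s = 1 if world point x is illuminated by the light with projection FlMl,
// s = 0 if it is in shadow.  epsilon is the depth bias described in the text.
float shadowTest(const Mat4& FlMl, const Vec4& x, const ShadowMap& map, float epsilon) {
    Vec4 xl = mul(FlMl, x);                    // light clip coordinates
    float u  = 0.5f * xl.x / xl.w + 0.5f;      // the bias matrix S maps [-1,1] to [0,1]
    float v  = 0.5f * xl.y / xl.w + 0.5f;
    float zt = 0.5f * xl.z / xl.w + 0.5f;      // this point's depth seen from the light
    float zm = map.T(u, v);                    // nearest occluder depth stored in the map
    return (zt < zm + epsilon) ? 1.0f : 0.0f;  // illuminated iff not behind the occluder
}
```

Linear (percentage-closer) filtering replaces the single nearest-neighbour comparison with an average of several such comparisons, which is what yields the fractional illumination value s.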
When K > 1 texture units are available on a graphics card it seems attractive to render multiple shadows in one pass by generating K shadow maps from the viewpoints of K different lights and binding each map to a different texture unit. It is then straightforward to compute s_k, the illumination coefficient for each light; however computing the final pixel value p^f is problematic since the rendering pipeline has already summed the lighting contributions from the K lights into a single pixel colour p. This difficulty is overcome by decomposing the final pixel colour into illumination and material properties as described in the following section.

3. A lighting compositing operator

In recent years several programmable distributed rendering frameworks have been developed, including Sepia2 [5], Lightning-2 [11] and PixelFlow [8]. Rendering nodes are arranged in a linear pipeline and each node generates an image of pixel values which are computed as a function of a locally rendered image and the output of the preceding render node in the pipeline. This function is known as a compositing operator and can be programmed in an application-specific manner. We have taken advantage of this programmability to design a new compositing operator suitable for the distributed rendering of global-illumination properties such as shadows.

Typically the locally rendered image is captured by the compositing card using a DVI interface and each pixel consists of a 24-bit (r, g, b) triplet. The data layout of the compositing function has more flexibility and in general a network pixel can contain more than 24 bits of data to be interpreted in an application-specific manner. We use this flexibility to decompose the pixel colour information into illumination-dependent and material-dependent channels which can then be recombined in a final compositing step as described in the following paragraphs; implementations using OpenGL on specific commodity graphics cards are given in section 4.

A typical pixel lighting equation, taken from the OpenGL specification, gives the final colour p^f of a pixel illuminated by L lights as

    p^f = e + m^a × c^a + Σ_{λ=1}^{L} [ (m^a × c^a_λ) i^a_λ + (m^d × c^d_λ) i^d_λ + (m^s × c^s_λ) i^s_λ ].    (1)

Here e is the light emitted by the material, m^a, m^d and m^s are the ambient, diffuse and specular material colours respectively and c^a is the global ambient illumination.


Each light λ has ambient, diffuse and specular illumination colours c^a_λ, c^d_λ and c^s_λ respectively, and i^a_λ, i^d_λ and i^s_λ are ambient, diffuse and specular attenuation coefficients which depend on per-pixel lighting parameters such as the location and orientation of the illuminated object, spotlighting, fog, etc. In this notation a bold-face variable u refers to a colour vector (u_r, u_g, u_b) and u × v denotes component multiplication (u_r v_r, u_g v_g, u_b v_b).

The lighting equation can be modified to include shadowing effects by including shadowing coefficients s_λ as follows:

    p^f = e + m^a × c^a + Σ_{λ=1}^{L} [ (m^a × c^a_λ) i^a_λ s_λ + (m^d × c^d_λ) i^d_λ s_λ + (m^s × c^s_λ) i^s_λ s_λ ]

which can be rewritten as

    p^f = e + m^a × (c^a + I^a) + m^d × I^d + m^s × I^s    (2)

where

    I^a = Σ_{λ=1}^{L} c^a_λ i^a_λ s_λ,   I^d = Σ_{λ=1}^{L} c^d_λ i^d_λ s_λ,   I^s = Σ_{λ=1}^{L} c^s_λ i^s_λ s_λ.

Since I^a, I^d and I^s do not depend on m^a, m^d or m^s, this suggests a strategy for partitioning the compositing pipeline into illumination nodes, which take account of lighting parameters and shadowing, and material nodes, which are programmed with the material properties of the scene objects. Given a pipeline in which each render node has K active textures, we will assign N nodes to be illumination nodes, allowing NK distinct light sources. We make the somewhat limiting assumption that the lights can be partitioned into N subsets (L_n)_{n=1}^{N}, each of size K, such that all the lights in a given subset are the same colour, i.e.

    (c^a_λ = c^a_n,  c^d_λ = c^d_n,  c^s_λ = c^s_n)   ∀ λ ∈ L_n.

This assumption is reasonable for many scenes, in particular when soft shadows are being simulated by placing multiple point light sources at sample positions on an area light source.

The illumination node is then programmed so that the colour at each pixel is given by the triplet (I^a_n, I^d_n, I^s_n) where

    I^a_n = Σ_{λ∈L_n} i^a_λ s_λ,   I^d_n = Σ_{λ∈L_n} i^d_λ s_λ,   I^s_n = Σ_{λ∈L_n} i^s_λ s_λ.    (3)

The compositing function at illumination node n computes three colours, p^a_n, p^d_n and p^s_n, where

    p^a_n = p^a_{n-1} + c^a_n I^a_n,   p^d_n = p^d_{n-1} + c^d_n I^d_n,   p^s_n = p^s_{n-1} + c^s_n I^s_n

and c^a_n, c^d_n and c^s_n are constants programmed into the compositing hardware on a per-frame or per-scene basis.

The output of node N, the final illumination node in the pipeline, is an image of 9-component pixels (p^a_N, p^d_N, p^s_N) which can be composited with the material colours of the scene in up to four material nodes at the end of the pipeline. The material nodes compute the following pixel colour triplets and compositing operations:

Specular material node N + 1:  S = (m^s_r, m^s_g, m^s_b)
    p^a_{N+1} = p^a_N,   p^d_{N+1} = p^d_N,   p^s_{N+1} = p^s_N × S

Diffuse material node N + 2:  D = (m^d_r, m^d_g, m^d_b)
    p^a_{N+2} = p^a_{N+1},   p^{ds}_{N+2} = p^d_{N+1} × D + p^s_{N+1}

Ambient material node N + 3:  A = (m^a_r, m^a_g, m^a_b)
    p^{ads}_{N+3} = p^a_{N+2} × (c^a + A) + p^{ds}_{N+2}

Emissive material node N + 4:  E = (m^e_r, m^e_g, m^e_b)
    p^f = E + p^{ads}_{N+3}

where c^a is a constant ambient lighting colour programmed into node N + 3 on a per-frame or per-scene basis. Note that no shadow or lighting computations are done on any of the material nodes, so all texture units are available for rendering the material properties of the scene.

Obvious simplifications to the lighting model can be made which reduce the number of material nodes. For example, photorealistic rendering usually assumes no ambient lighting, which removes the need for node N + 3. Scenes which do not include light-emitting materials can be rendered without node N + 4. A final simplification can be made if the specular material colour m^s is the same for all objects. In this case (assuming no ambient or emissive lighting), the compositing function in the illumination nodes is modified to compute

    p^d_n = p^d_{n-1} + c^d_n I^d_n,   p^s_n = p^s_{n-1} + (m^s × c^s_n) I^s_n    (4)

and only a single material node is needed, which computes

    D = (m^d_r, m^d_g, m^d_b),   p^f = p^d_N × D + p^s_N.    (5)

Our implementation in fact also provides limited support for global ambient illumination when c^a = c^d_n for some n and either m^a = 0 or m^a = m^d for all materials in the scene. Illumination node n is then programmed to compute

    p^d_n = p^d_{n-1} + c^d_n (I^d_n + I^a),   p^s_n = p^s_{n-1} + (m^s × c^s_n) I^s_n    (6)

where

    I^a = { 1   if m^a = m^d
          { 0   otherwise    (7)

and we use this for example to simulate the light-emitting material in the bottom image of figure 3 (see color plates).
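The per-pixel arithmetic that the illumination and material nodes must implement is simple. The following C++ sketch, written for this summary rather than taken from the Sepia2a compositing hardware, evaluates equations (4) and (5) for one network pixel, with the shared specular material colour m^s folded into the illumination nodes as described above; equation (6) is obtained by passing I^d_n + I^a in place of I^d_n.

```cpp
// Software sketch of the compositing arithmetic of equations (4) and (5),
// evaluated per network pixel.  Names are illustrative only.
struct Colour { float r, g, b; };

static Colour add(Colour a, Colour b)   { return { a.r + b.r, a.g + b.g, a.b + b.b }; }
static Colour mul(Colour a, Colour b)   { return { a.r * b.r, a.g * b.g, a.b * b.b }; }  // component product
static Colour scale(Colour a, float k)  { return { a.r * k, a.g * k, a.b * k }; }

// Running illumination totals carried from node to node in the network pixel.
struct NetworkPixel { Colour pd, ps; };   // diffuse and specular accumulators

// Illumination node n, equation (4): cd_n, cs_n and ms are per-frame constants;
// Id_n and Is_n are the per-pixel coefficient sums rendered by the host graphics card.
NetworkPixel illuminationNode(NetworkPixel prev, Colour cd_n, Colour cs_n, Colour ms,
                              float Id_n, float Is_n)
{
    NetworkPixel out;
    out.pd = add(prev.pd, scale(cd_n, Id_n));            // pd_n = pd_{n-1} + cd_n * Id_n
    out.ps = add(prev.ps, scale(mul(ms, cs_n), Is_n));   // ps_n = ps_{n-1} + (ms x cs_n) * Is_n
    return out;
}

// Single material node, equation (5): D is the per-pixel diffuse material colour
// rendered locally by the material node's graphics card.
Colour materialNode(NetworkPixel last, Colour D)
{
    return add(mul(last.pd, D), last.ps);                // pf = pd_N x D + ps_N
}
```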


4. Implementation

We have implemented the lighting compositing operator (6) in the Sepia2a parallel rendering framework using nVidia GeForce4 Ti 4600 cards. (At the time of writing a Sepia cluster populated with nVidia GeForce4 Ti 4600 cards is not yet available, so results shown are based on a cycle-accurate simulated execution of the compositing hardware designs.) We have also investigated the feasibility of using ATI Radeon 8500 cards and this is briefly discussed in section 4.2. Sepia2a is based on the Sepia2 distributed rendering architecture [5] but supports transmission of the local image from a graphics card directly to the Sepia PCI card using a DVI interface without passing through the host PCI bus. The natural size for a network pixel is 64 bits and illumination node n must compute p^d_n and p^s_n, for which we allocate 11 and 10 bits per channel respectively, leaving 1 bit unused.

Illumination node n renders K lights in K + 1 passes. The first K passes are used to generate shadow maps from the viewpoint of each light in turn, and details for the two graphics cards are given in sections 4.1 and 4.2 respectively. The final pass renders the image which will be sent to the Sepia card for compositing.

The illumination-node compositor computes (6), so the host graphics card must supply (I^d_n + I^a) and I^s_n defined in (3) and (7). The K texture units can be programmed to generate the s_k, so it remains to generate i^d_k and i^s_k for each light along with I^a, and to combine the coefficients. Both target graphics cards contain programmable fragment stages which can be used for per-pixel lighting computations. In both cases the number of interpolator inputs is severely limited: a four-channel (r, g, b, a) primary colour, a three-channel (r, g, b) secondary colour, and K four-channel (r, g, b, a) texture values. Since we wish to use all of the texture units for shadowing, we are constrained to place the i^s_k, i^d_k and I^a in the primary and secondary colours, which permits seven channels in all.

We limit ourselves to generating diffuse and specular illumination components for at most three unique lights, and place (i^d_1, i^d_2, i^d_3, I^a) in the primary colour and (i^s_1, i^s_2, i^s_3) in the secondary colour. If K > 3 we therefore enforce the restriction that the K lights must be partitioned into three subsets G_1, G_2 and G_3 such that the light positions l_k are clustered around centres c_1, c_2 and c_3, and

    l_k ≈ c_i   ∀ k ∈ G_i.

On our target cards K is at most 4, so in practice this restriction amounts to placing two of the four lights close together, which is reasonable for our target application of soft shadowing, which clusters many identical lights close together in any case. Since shadow boundaries have much higher spatial frequency than either diffuse or specular lighting variations, it is still worthwhile to generate 4 shadows given only 3 lighting locations. It would be possible, at the expense of abandoning support for simple ambient lighting, to place i^s_4 for a fourth light location in the alpha channel of the primary colour, to account for the higher spatial variation of specular lighting compared with diffuse lighting.

It is straightforward to persuade OpenGL to place the desired information in the primary and secondary colour channels. All material diffuse and specular RGB values are set to (1,1,1), while the other specular parameters such as shininess are set according to the desired material properties for each object. Materials for which m^a = m^d have the alpha component of their diffuse material colour set to 1, otherwise it is set to 0. Three lights are enabled at locations c_1, c_2 and c_3 with diffuse and specular colours both set to (1,0,0,0), (0,1,0,0) and (0,0,1,0) respectively, and programmed with the desired parameters for attenuation, spotlighting, etc. Details of programming the fragment shaders are given in sections 4.1 and 4.2.

The material node images are all straightforward to generate. No special graphics card programming is required; the scene is rendered from the eye's viewpoint with lighting disabled and object colours set to the appropriate material colour m^d or m^s.

4.1. nVidia GeForce3/4 Ti

We have implemented the illumination-node code on an nVidia GeForce4 Ti 4600 graphics card. The programming model is identical for other cards in the GeForce3 and GeForce4 Ti series. These cards support depth textures, so generating shadow map k is straightforward. A texture map of the desired size is created with internal format DEPTH_COMPONENT24_ARB and the scene is rendered from viewpoint l_k with all lighting, texturing and colour buffers disabled. If available, the WGL_ARB_render_texture extension can be used to render directly to the texture; otherwise the image is rendered to the frame buffer and copied internally to the graphics card using glCopyTexSubImage2D.

Before rendering the scene from the eye's viewpoint, texture k is bound to texture unit k and all texture units are programmed to clamp to a border depth of 1.0, with linear filtering enabled. GL_TEXTURE_COMPARE_MODE_ARB is set to GL_COMPARE_R_TO_TEXTURE with GL_TEXTURE_COMPARE_FUNC_ARB set to GL_LEQUAL. Coordinate generation is enabled for all four texture coordinates in GL_EYE_LINEAR mode, and the (s, t, r, q) GL_EYE_PLANE values are respectively set to the four rows of the matrix S F_l M_l, where

    S = [ 0.5  0.0  0.0  0.5 ]
        [ 0.0  0.5  0.0  0.5 ]
        [ 0.0  0.0  0.5  0.5 ]
        [ 0.0  0.0  0.0  1.0 ]

Three general combiner stages are used, and the program is given in figure 1.
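The OpenGL state described above can be established with standard extension calls. The sketch below shows one plausible arrangement of the shadow-map pass and the texture bindings for the final pass; it is an illustration rather than code from the paper, and the helper renderSceneDepthOnly() and the precomputed row-major matrix S F_l M_l are assumptions supplied by the application.

```cpp
// Sketch of the GeForce3/4-class setup described in section 4.1, using the
// ARB_depth_texture, ARB_shadow and ARB_texture_border_clamp extensions.
#define GL_GLEXT_PROTOTYPES
#include <GL/gl.h>
#include <GL/glext.h>

void renderSceneDepthOnly();   // assumed: application routine drawing geometry only

// One of the first K passes: build shadow map k from light viewpoint l_k.  The caller
// is assumed to have loaded the light's projection Fl and modelview Ml and to have set
// the viewport to size x size before calling.
void buildShadowMap(GLuint tex, GLsizei size /* e.g. 512 */)
{
    glBindTexture(GL_TEXTURE_2D, tex);
    glTexImage2D(GL_TEXTURE_2D, 0, GL_DEPTH_COMPONENT24_ARB, size, size, 0,
                 GL_DEPTH_COMPONENT, GL_UNSIGNED_INT, 0);      // allocate depth texture

    glColorMask(GL_FALSE, GL_FALSE, GL_FALSE, GL_FALSE);       // colour writes off
    glDisable(GL_LIGHTING);                                    // lighting off
    renderSceneDepthOnly();
    glColorMask(GL_TRUE, GL_TRUE, GL_TRUE, GL_TRUE);

    // Without render-to-texture, the depth buffer is copied into the texture.
    glCopyTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, 0, 0, size, size);
}

// Before the final eye-view pass: bind shadow map k to texture unit k and enable the
// depth comparison and eye-linear coordinate generation described in the text.
void bindShadowMap(GLuint tex, int unit, const GLfloat SFlMl[4][4])
{
    glActiveTextureARB(GL_TEXTURE0_ARB + unit);
    glBindTexture(GL_TEXTURE_2D, tex);

    const GLfloat border[4] = { 1.0f, 1.0f, 1.0f, 1.0f };      // border depth of 1.0
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_BORDER_ARB);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_BORDER_ARB);
    glTexParameterfv(GL_TEXTURE_2D, GL_TEXTURE_BORDER_COLOR, border);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);  // percentage-closer
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);  // filtering

    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_COMPARE_MODE_ARB, GL_COMPARE_R_TO_TEXTURE_ARB);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_COMPARE_FUNC_ARB, GL_LEQUAL);

    // (s,t,r,q) eye planes set to the four rows of S*Fl*Ml.  Note that GL_EYE_LINEAR
    // planes are transformed by the inverse of the modelview matrix current at this
    // call, so they should be specified with the eye-view modelview Mc loaded.
    const GLenum coord[4] = { GL_S, GL_T, GL_R, GL_Q };
    const GLenum gen[4]   = { GL_TEXTURE_GEN_S, GL_TEXTURE_GEN_T,
                              GL_TEXTURE_GEN_R, GL_TEXTURE_GEN_Q };
    for (int i = 0; i < 4; ++i) {
        glTexGeni(coord[i], GL_TEXTURE_GEN_MODE, GL_EYE_LINEAR);
        glTexGenfv(coord[i], GL_EYE_PLANE, SFlMl[i]);
        glEnable(gen[i]);
    }
    glEnable(GL_TEXTURE_2D);
}
```

With this state in place the comparison result s_k for shadow map k arrives at the fragment stage through texture unit k as a value in [0, 1], ready for the register combiner program of figure 1.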


Stage 0:    spare0_rgb = texture0_rgb × const0_rgb + texture1_rgb × (1 − const0_rgb)
            spare0_a   = texture2_a · 1 + texture3_a · 1

Stage 1:    spare0_rgb = spare0_rgb × (1 − const1_rgb) + spare0_a × const1_rgb

Stage 2:    spare0_rgb = spare0_rgb • primary_rgb
            spare1_rgb = spare0_rgb • secondary_rgb

Final:      final_rgb = spare0_rgb × const0_rgb + spare1_rgb × (1 − const0_rgb) + primary_a × const0_rgb

On entry:   const0 = (1, 0, 0, 0),   const1 = (0, 0, 1, 0)
            primary = (i^d_0, i^d_1, i^d_2, I^a),   secondary = (i^s_0, i^s_1, i^s_2, 0),   texture_k = (s_k, s_k, s_k, s_k)

On exit:    final = ( i^d_0 s_0 + i^d_1 s_1 + i^d_2 (s_2 + s_3) + I^a,
                      i^s_0 s_0 + i^s_1 s_1 + i^s_2 (s_2 + s_3),
                      i^s_0 s_0 + i^s_1 s_1 + i^s_2 (s_2 + s_3) )

Figure 1: The register combiner program for rendering four shadows on an nVidia GeForce4 Ti graphics card.
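To make the effect of the program easier to verify, the following C++ fragment is a reference emulation written for this summary (not code from the paper) of the three general combiner stages and the final combiner for a single fragment, using the "On entry" and "On exit" conventions of figure 1.

```cpp
// Reference emulation of the figure 1 combiner program for one fragment.
// idc[k] and isc[k] are the diffuse/specular coefficients carried in the primary and
// secondary colours, Ia is the ambient flag, s[k] are the four shadow comparison results.
struct RGB { float r, g, b; };

RGB combineFourShadows(const float idc[3], const float isc[3], float Ia, const float s[4])
{
    // Stage 0: spare0.rgb = (s0, s1, s1), spare0.a = s2 + s3
    RGB   spare0  = { s[0], s[1], s[1] };
    float spare0a = s[2] + s[3];

    // Stage 1: merge s2 + s3 into the blue channel -> spare0.rgb = (s0, s1, s2 + s3)
    spare0.b = spare0a;

    // Stage 2: dot products against the interpolated coefficients
    float diffuse  = idc[0] * spare0.r + idc[1] * spare0.g + idc[2] * spare0.b;
    float specular = isc[0] * spare0.r + isc[1] * spare0.g + isc[2] * spare0.b;

    // Final combiner: red carries Id_n + Ia, green and blue carry Is_n,
    // matching the "On exit" line of figure 1.
    return { diffuse + Ia, specular, specular };
}
```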

4.2. ATI Radeon 8500

We have investigated implementing illumination nodes using an ATI Radeon 8500 card, which does not support depth textures. Although the card has K = 6 active textures, the depth comparison must be performed as part of the fragment shader program and so two texture coordinates must be assigned to each shadow map, so at most three shadows can be rendered in a single pass. A 16-bit precision depth comparison can be implemented in a two-pass shader program, as opposed to the 24-bit comparison performed by the depth texture functionality on the nVidia card. Unfortunately percentage-closer filtering is not possible with this implementation and so aliasing artifacts are much more visible. In addition the performance is significantly worse than the GeForce4 Ti 4600 implementation, so all results are shown using the nVidia card.

5. Results

We tested the algorithm on a set of simple models using an nVidia GeForce4 Ti 4600 graphics card in a Compaq Evo D500 1.7 GHz P4 workstation running Redhat Linux 7.2. Figure 3 (see color plates) shows images as they would be rendered at 800 × 600 pixels on a 9-node Sepia2a cluster using 512 × 512-pixel textures for the shadow maps. As a Sepia2a cluster populated with GeForce4 Ti 4600 graphics cards was not available, the compositing is done in simulation. The local rendering code is run exactly as it will be in a full system, then the image is grabbed using glReadPixels and a software simulation of the compositing operators is run to generate the final image. Figure 2 shows timings measured on our single-node test setup. The Sepia architecture introduces a latency of approximately two frame refresh periods, while the number of frames rendered per second is approximately that of the slowest node in the pipeline. The nVidia driver we used for Linux does not support direct rendering to textures, though this is supported by the Windows drivers. We measure the time for the glCopyTexSubImage2D call to be 1.31 ms per computed shadow map.

6. Discussion

We demonstrate an algorithm which is able to render approximate soft shadows at interactive rates on a cluster of commodity computers linked by a Sepia2a compositing network. The number of lights scales linearly with the number of available nodes, and increasing the number of rendering nodes results in a negligible reduction in performance. For walkthrough applications the new algorithm reduces the number of required rendering nodes by a ratio of 1 + ε : 4 compared with a naive approach, where ε → 0 as the number of lights increases. For scenes with changing geometry a naive approach renders one shadow per node using 2 rendering passes. The new algorithm must perform K + 1 rendering passes to render K shadows, so as long as the timing budget permits at least two shadow maps to be rendered per node the algorithm still decreases the number of required render nodes by a ratio of 1 + ε : 2, while the ratio of 1 + ε : 4 is achieved if the timing budget permits 5 shadow maps per illumination node.


The main limitation of the method is that it scales badly with increasing scene complexity, as each node must render the full scene geometry. For walkthrough applications it would be straightforward to perform a sort-first decomposition of the scene so that each illumination node renders only a subset of the visible geometry. It is well known [7], however, that sort-last scene decompositions allow better load balancing of distributed rendering than sort-first methods. A sort-last Z-compositing approach is feasible using the algorithm presented here at the expense of transferring more data in each network pixel. Illumination nodes would have to transmit not only the 63-bit diffuse and specular components (p^d_n, p^s_n) but also (r, g, b, z) channels describing a partially rendered image, which would typically be assigned 48 bits in total: 8 bits per channel for colour and 24 bits for the Z-buffer. Alternatively it would be possible to use a rendering architecture which supports a "join" operator taking network pixels from more than one preceding render node in the pipeline without increasing the maximum number of bits transmitted in a network pixel.

Minor updates to the algorithm would allow rendering shadows cast by pseudo-transparent objects using depth-peeling [2]. Unfortunately the number of rendering nodes used by such an approach is O(D^2), where D is the number of depth-peeling layers, so this is probably not feasible for D much greater than 4.

Model \ Lights per node        1       2       3       4
Balls (no shadows)           0.98    1.13    1.43    1.44
Balls (walkthrough)          1.11    1.26    1.55    1.57
Balls (moving lights)        3.31    5.65    8.25   10.71
Horse (no shadows)           1.17    1.30    1.45    1.60
Horse (walkthrough)          1.26    1.40    1.54    1.69
Horse (moving lights)        2.39    3.82    5.66    7.49
Plant (no shadows)           1.13    1.08    1.18    1.35
Plant (walkthrough)          1.24    1.21    1.29    1.47
Plant (moving lights)        3.47    5.64    7.98   10.70
Box (no shadows)             0.54    0.55    0.58    0.62
Box (walkthrough)            0.77    0.79    0.81    0.83
Box (moving lights)          2.04    3.53    5.31    6.98

Figure 2: Rendering times in ms for the models in figure 3 (see color plates), which are shown rendered with 4 lights per node. Times marked "no shadows" correspond to rendering the scene in full colour with OpenGL lighting enabled. Times marked "walkthrough" correspond to rendering the scene with precomputed shadow maps with the fragment shader programmed as described in figure 1. Times marked "moving lights" are as for "walkthrough" but the shadow maps are recomputed at each frame. Timings are given for a single node, and the Sepia architecture renders composited frames at approximately the speed of the slowest node in the pipeline. Note that for "walkthrough" scenes these rendering times are a small fraction of typical screen refresh rates, and even the slowest "moving lights" rendering times correspond to refresh rates above 93 Hz.

References

1. F. Crow. Shadow algorithms for computer graphics. In Proceedings of SIGGRAPH, pages 242-248, 1977.
2. C. Everitt. Order independent transparency. Technical report, nVidia, developer.nvidia.com.
3. C. Everitt and M.J. Kilgard. Practical and robust stenciled shadow volumes for hardware-accelerated rendering. Technical report, nVidia, developer.nvidia.com.
4. C. Everitt, A. Rege, and C. Cebenoyan. Hardware shadow mapping. Technical report, nVidia, developer.nvidia.com.
5. A. Heirich and L. Moll. Scalable distributed visualization using off-the-shelf components. In IEEE Parallel Visualization and Graphics Symposium, pages 55-60, 1999.
6. T. Lokovic and E. Veach. Deep shadow maps. In Proceedings of SIGGRAPH, pages 385-392, 2000.
7. S. Molnar, M. Cox, D. Ellsworth, and H. Fuchs. A sorting classification of parallel rendering. IEEE Computer Graphics and Applications, 14(4):23-32, 1994.
8. S. Molnar, J. Eyles, and J. Poulton. PixelFlow: High-speed rendering using pixel composition. Computer Graphics, 26(2):231-240, 1992.
9. W.T. Reeves, D.H. Salesin, and R.L. Cook. Rendering antialiased shadows with depth maps. In Proceedings of SIGGRAPH, pages 283-291, 1987.
10. M. Segal, C. Korobkin, R. van Widenfelt, J. Foran, and P. Haeberli. Fast shadows and lighting effects using texture mapping. Computer Graphics, 26(2):249-252, 1992.
11. G. Stoll, M. Eldridge, D. Patterson, A. Webb, S. Berman, R. Levy, C. Caywood, M. Taveira, S. Hunt, and P. Hanrahan. Lightning-2: A high-performance display subsystem for PC clusters. In Proceedings of SIGGRAPH, pages 141-148, 2001.
12. L. Williams. Casting curved shadows on curved surfaces. In Proceedings of SIGGRAPH, pages 270-274, 1978.


Figure 3: Various simple models rendered with 32 point light sources. The first three images approximate two area lights with 16 samples each and the bottom right image approximates a single area light with 32 samples.
