Image-Based Rendering of Surfaces from Volume Data

Baoquan Chen*        Arie Kaufman†        Qingyu Tang†

*4-192 EE/CSci Building, 200 Union St. SE, Minneapolis, MN 55455. Email: baoquan@cs.umn.edu
†Center for Visual Computing and Computer Science Department, State University of New York at Stony Brook, Stony Brook, NY 11794-4400. Email: ari@cs.sunysb.edu

Abstract

We present an image-based rendering technique to accelerate surface rendering from volume data. We cache the fully volume rendered image (called a keyview) and use it to generate novel views without ray-casting every pixel. This is achieved by first constructing an underlying surface model of the volume and then texture mapping the keyview onto the geometry. When the novel view moves slightly away from the keyview, most of the originally visible regions in the keyview are still visible in the novel view. Therefore, we only need to cast rays for pixels in the newly visible regions, which usually occupy only a small portion of the whole image, resulting in a substantial speedup. We have applied our technique to a virtual colonoscopy system and have obtained interactive navigation through a 512³ patient colon. Our experiments demonstrate an average of an order of magnitude speedup over traditional volume rendering while compromising very little on image quality.

1 INTRODUCTION

Major efforts have been dedicated to accelerating volume rendering, in both software and hardware approaches. One hardware approach is to leverage existing hardware, especially 3D texture mapping hardware, to accelerate volume rendering [3, 8, 29]. However, 3D texture mapping hardware is not scalable and does not support interactive classification. Another hardware approach is to utilize special-purpose hardware, such as VolumePro [24], for volume rendering. VolumePro can render a 256³ volume at 30 frames per second. However, the performance of VolumePro is still challenged by large data sets. For example, for a 512³ data set, theoretically only 3-4 frames per second can be obtained. Furthermore, the current VolumePro supports only parallel projection. Many applications, for example, the virtual colonoscopy system in biomedical applications, require perspective projection. Parallel projection of a straight tube appears as a ring, which prevents us from examining the interior walls of the colon. With some software design, VolumePro could be utilized to support perspective projection, but at great cost in rendering speed as well as image quality [16]. A typical data size in a virtual colonoscopy system is 512³; a straightforward implementation using VolumePro delivers a much lower frame rate than the interaction requirement when perspective rendering is supported.

The software approach includes early ray termination [19], distance encoding, presence acceleration [1, 6], ray coherence [14], and shear warping [17]. These methods still cannot meet the interactive speed requirement unless they are parallelized on powerful machines [17, 23].

Recently, image-based rendering (IBR) techniques have been developed to model complex scenes using images or sprites instead of complex geometries, and to render directly from images. One can pre-compute a group of keyviews (or reference images) so that any novel view can be generated from one (or a subset) of these keyviews. This is similar to sampling the full plenoptic function [21], resulting in 3D, 4D or even 5D image sprites [12, 20]. These methods usually need very dense view sampling (thousands of images), which requires a large amount of memory storage. Dally et al. [9] use a delta tree to represent all the keyviews.

Even though most of these methods were developed for accelerating polygon rendering, their concepts can be applied to accelerating volume rendering. Choi and Shin [5] have applied Dally et al.'s method to render the fully opaque surface of the volume, using a quadtree structure instead. Their method supports only parallel projection. Even though tens of thousands of reference images are precomputed, holes may still appear in the generated image. Nearest neighbor (zero-order) interpolation has been used for efficiency, but at the cost of degraded image quality. Generally, a traditional IBR method is inappropriate for volume rendering because whenever the transfer function changes, the whole set of reference images must be regenerated, resulting in a huge amount of computation and storage. Gudmundsson and Randen [13] have proposed to utilize the coherence between neighboring views and to generate the novel view incrementally. This incremental approach avoids storing thousands of images. Similar to the above technique, it supports only parallel projection and the rendering of the fully opaque surface. In addition, it delivers degraded image quality, as only zero-order interpolation is used; similarly, holes may appear in the generated image. These disadvantages have prevented these methods from broad usage in volume rendering. Yagel et al. [30] have extended Gudmundsson and Randen's technique to support flexible volume rendering. Rather than using the incremental method to directly generate the final image, they only use it to generate the "C-buffer," storing the object-space coordinates of the first non-empty voxel visible from every pixel. They then use the generated "C-buffer" to accelerate the ray-casting operation by "space-leaping" (skipping the empty space). Full rays still have to be cast into the data. Currently, space-leaping is considered a standard optimization technique for ray-casting, together with several other techniques, such as early ray termination. The technique discussed in this paper will be compared against this existing optimized ray-casting.

Most recently, more research has been conducted to further accelerate volume rendering using the image-based approach. Brady et al. [2] have proposed a two-phase approach. This method divides the viewing frustum into regions (or slabs) based on the Euclidean distance from the eyepoint. The algorithm then ray-casts each region separately. The images created from the regions are subsequently texture mapped onto polygons (or billboards) and composited together using conventional graphics hardware. The images of some back regions are reused for compositing to speed up the rendering of subsequent frames. A problem with billboards is that when the viewpoint changes, the pre-generated billboard images shift with respect to each other. This leads to two unpleasant situations: (1) holes may appear at the edges of the billboards, or (2) the integrated object may be pulled apart. More recently, Mueller et al. [22] have reported a scheme to enhance the billboards with depth information. There, a billboard (or a slab image) is subdivided into a grid of small tiles and a z-value is approximated for each tile to form a coarse-scale "z-buffer." The z-value represents the average of the nearest and farthest z-values of the object. A polymesh for each slab image is then generated from the "z-buffer," which represents a spatial approximation of the underlying object surface. Now, when the slab images are mapped onto the polymeshes rather than the planes, the viewpoint is allowed to move further before holes and artifacts become visible. Nevertheless, this method does not eliminate the holes and artifacts completely. While for translucent objects these artifacts tend to be disguised by their fuzzy nature, they can show up easily for more opaque objects with clear structures. The main source of these artifacts is the way the polymeshes are constructed and used. First of all, because the polymesh for each slab stays in the center of the slab, the neighboring meshes do not attach to each other, which makes it inevitable that holes will show up when the novel viewpoint moves to the side. For example, if a box is cut into two slabs, the two polymeshes constructed will simply be two separate planes in the center of each slab. When the novel viewpoint moves slightly to the side, the side faces of the box will be immediately pulled apart. This is immediately visible for a non-transparent box. However, when the box is transparent and filled with gel, the artifact may be tolerated longer. Second, since the meshes are view-dependent, they do not reflect the real 3D topology of the scene. For example, when two separate objects in 3D space overlap on screen, the polymesh generated will be a connected mesh. Therefore, when the novel viewpoint moves to a position where we should see the gap between the two objects, we will instead still see a connected object. This situation is especially problematic for a colonic navigation where holes persist. Finally, while Mueller et al. have experimented with manipulating an object by rotating it back and forth around the keyview, no continuous navigation has been demonstrated. However, in a continuous navigation system, smoothness of the image quality is desired; the artifacts issue becomes pressingly critical, but more challenging to solve.

The goal of this paper is to design a novel method solving all the above problems, except that we focus on accelerating Levoy's surface rendering from volumes [18]. We eliminate dense view sampling by constructing a 3D model from the volume and utilizing spatial coherence during the navigation, very similar to [22]. Here, we first render the volume using conventional volume rendering and cache the rendered image as a keyview. For subsequent novel views, we texture-map the keyview image onto the underlying geometry and view it from a novel camera position. Although the high-level philosophy here is similar to Brady et al.'s and Mueller et al.'s methods, the differences are distinguishable, and mainly come from the mesh we use: (1) We use a view-independent full geometry, which faithfully represents the underlying object surface. It is an enclosed and integrated mesh; therefore no holes will show up in the rendering. In addition, the mesh does not need to be constructed for every keyview (it does need to be reconstructed whenever the transfer function changes). (2) We utilize this view-independent full geometry to detect newly emerging scenes on the fly and send new rays to render these scenes. Furthermore, we utilize the mesh to detect which part of the object needs to be re-rendered to avoid poor texture mapping quality. These operations allow us to control the quality of the generated image, rather than passively letting the image quality degrade. (3) We employ an adaptive mesh, using more polygons for a surface with higher curvature and fewer polygons for a relatively flat surface. This is superior to a uniform polymesh, which does not adapt to the underlying surface topology. An adaptive mesh promises better image quality than a uniform mesh if both use the same number of polygons. This is very important when we have a limited budget to render polygons, which is the usual condition. (4) In addition, we have built continuous Levels-Of-Detail (LOD) meshes of the 3D geometry so that we can select a typical level for a practical navigation to fine-tune the performance and image quality. In general, using the model with more polygons delivers a higher quality image, yet lowers the rendering performance (but there is a limit on the image quality gained by increasing the level of the model).

We strive to apply our technique to a real navigation system to demonstrate its effectiveness. We have chosen the virtual colon navigation system as our major application, even though our technique can be applied to other volume rendering systems. Previously, a polygon-based interactive virtual colon navigation system was developed [15], where the colon was represented as a polygonal surface to seek graphics hardware support. Yet, polygon rendering not only removes small features of the object but also adds some artifacts caused by the sharp edges and silhouettes of the polygon surface [28] (see Figure 1a). A direct volume rendering technique can meet physicians' demands by directly mapping certain ranges of sample values of the original volume data to different colors and opacities (see Figure 1b). However, direct volume rendering, even rendering only thin surfaces, is very expensive, especially for large colon data, such as 512³. As we have discussed earlier, it is a paramount challenge even with VolumePro support. Mueller et al.'s method appears inappropriate for such an application for the aforementioned reasons. Further efforts have been dedicated to accelerating the volume rendering using a depth buffer [31] or a distance field [28] to skip the empty space inside the colon, both with a parallel implementation. The performance achieved still did not meet the interactive speed. Generating a frame of a patient's colon of 512×512×361 needs about one second when fully casting 512×512 rays on an SGI Challenge using 9 R10000 processors. Our goal is to make this system interactive on a commodity machine (with only a single processor) while maintaining the features of direct volume rendering.

In the next section (Sec. 2), we give an overview of our algorithm. After that, we discuss the issues involved in the algorithm, including geometry construction (Sec. 3), texture mapping and polygon visibility detection (Sec. 4), and strategies for rendering keyviews (Sec. 5). Finally, implementation and results are demonstrated in Sec. 6, followed by conclusions in Sec. 7.

Figure 1: Comparison of the images rendered by (a) polygon rendering and (b) volume rendering.

Figure 2: Overview of our algorithm. (Flowchart: the volume data and the transfer function drive isosurface extraction to build the surface model; ray-casting produces the keyview; texture mapping with hole detection, followed by filling the holes via ray casting, produces the novel view.)

2 ALGORITHM OVERVIEW

The central idea of our algorithm is to texture-map the volume rendered image, or keyview, onto an underlying geometry constructed from the volume to speed up the generation of novel views. Figure 2 illustrates our algorithm. During preprocessing, we construct a surface model from the input volume. Intuitively, this surface model represents the outermost visible surface of the volume as determined by the input transfer function — a mapping from a voxel's density to an opacity value. Therefore, the iso-value of the surface model is chosen as the minimum voxel value with non-zero opacity. Once we have the surface geometry of the volume and the volume rendered keyview(s), we can use them to speed up the generation of the novel view(s). The general guideline is that we texture map the keyview image onto the geometry and project the geometry to the novel view position. A critical issue that we strive to address is maintaining the image quality. When performing the texture mapping, holes may appear. We consider two situations. The first situation is that when the viewpoint changes, the originally occluded or view-frustum culled parts in the keyview could become visible. For these newly visible parts, no textures are available from the keyview; hence, we have to cast new rays (the red rays in Figure 3). To achieve this, we detect each polygon's visibility in the keyview when performing the texture mapping and render invisible polygons with a user-specified special color (red, or lightly shaded on black-and-white printing, in Figure 2). The second situation is that the projection area of a visible polygon may increase greatly because of the view position change, which leads to great texture magnification, resulting in a fuzzy image. To guarantee the rendering quality of each visible polygon, we calculate its texture fidelity, a parameter factoring in the texture magnification. If the calculated fidelity value is less than a given threshold, we render it in the special color as well. Therefore, after the texture mapping, all pixels having the special color indicate holes. As the final operation, we scan the frame buffer and cast new rays to fill these holes.
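As a minimal sketch of this final pass, assuming a simple in-memory RGB frame buffer; the castRay parameter stands in for the system's per-pixel ray-casting routine:

```cpp
#include <cstdint>
#include <functional>
#include <vector>

struct RGB { std::uint8_t r, g, b; };

// Final operation of novel-view generation: scan the frame buffer and cast
// new rays only for pixels still carrying the special "hole" color.
void fillHoles(std::vector<RGB>& frame, int width, int height, RGB special,
               const std::function<RGB(int, int)>& castRay)
{
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            RGB& p = frame[y * width + x];
            if (p.r == special.r && p.g == special.g && p.b == special.b)
                p = castRay(x, y);   // newly visible or low-fidelity pixel
        }
    }
}
```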

Errors are introduced in this procedure. As depicted in Figure 3, black rays of the keyview (v1) are used to approximate blue rays in the novel view (v2). Apparently, two rays of two views hitting the same point traverse different directions with different depths, leading to errors. However, if (1) the two viewpoints are close and (2) the rays do not traverse too far, the error caused by this approximation will be invisible. The first condition is usually the case during a smooth navigation; the second condition generally holds for Levoy's surface volume rendering, where rays traverse only a short distance before they reach full opacity. For the more general volume rendering case, the second condition is no longer valid, and we could break the volume into slabs as in Mueller et al.'s and Brady et al.'s methods to ensure the second condition holds within each slab. In this paper, we only emphasize surface volume rendering.

Figure 3: Black rays (dotted) in the keyview (v1) are used to approximate blue rays (dashed) in the novel view (v2). The red ray (rightmost) is a new ray shot in the novel view.

Figure 4: Different levels of detail of an iso-surface of a CT head (128×128×113): (a) high level (original) mesh (361,528 triangles), (b) middle level mesh (168,708 triangles), (c) low level mesh (81,246 triangles).

3 GEOMETRY MODEL

One issue in constructing a geometry model is its accuracy, on which the texture mapping quality depends. We have some opposing considerations. A finely detailed geometry will maintain the object topology, especially the silhouette, but is slow to render, while a simplified model enhances the speed of polygon rendering but is less accurate. The conclusion is that we need to construct a geometry at a certain level of detail to balance these issues. Our technique is open to any volumetric surface extraction and simplification method, except that a crack-free method is desired, such as the methods in [25, 32]. If a crack appears in a polygonal model, it may end up as a hole in the final image. Here, we have extended the method by Zhou et al. [32] to create a crack-free LOD mesh. First, we represent the volume using an implicit tetrahedral subdivision and then fuse adjacent tetrahedra together. This fusion is based on a user-specified error threshold, which defines the maximum deviation of the resulting iso-point after the fusion. Care is taken in the fusion stage so that the generated tetrahedral representation maintains a smooth transition between levels. After that, an iso-surface is extracted from the tetrahedral approximation, which is automatically simplified, smoothed, and crack-free. Figure 4 shows three different levels of detail meshes (controlled by different error thresholds). As we can see, the number of triangles varies greatly (from 361K to 81K) but the topology is preserved. The middle level mesh is the one that we have used in our implementation in Section 6. It gives us an appealing balance between overall performance and generated image quality.
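As a small concrete note on the iso-value that drives this extraction: Section 2 chose it as the minimum voxel value with non-zero opacity. A sketch, assuming an 8-bit volume whose transfer function is tabulated as 256 opacity entries (the function name is ours):

```cpp
#include <array>

// Iso-value selection (Sec. 2): the minimum voxel value with non-zero
// opacity under the current transfer function.
int isoValueFromTransferFunction(const std::array<float, 256>& opacity)
{
    for (int v = 0; v < 256; ++v)
        if (opacity[v] > 0.0f)
            return v;   // e.g., a linear ramp over [45, 65] yields iso-value 45
    return -1;          // fully transparent transfer function: no surface
}
```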

4 TEXTURE MAPPING NOVEL VIEWS

Texture mapping of a keyview could be achieved using the hardware supported projective texture mapping in OpenGL. However, we discard this approach because its current hardware implementation does not test for the visibility of polygons. Occluded polygons in the keyview would be mapped with the wrong textures of the front occluding polygons when they become visible in the novel view. Instead, we seek to detect the polygon visibility on our own. On the other hand, to perform the texture mapping we have to compute the 2D texture coordinates for each vertex. Since this computation involves projecting each vertex onto the screen, we automatically obtain the depth value; hence the visibility test for each vertex becomes free. We further propose a scheme to perform the visibility test of a triangle based only on the visibility of the triangle's three vertices. As we will see in the following, our calculation of the visibility and the texture fidelity of a triangle is independent of the novel view; we therefore calculate them only once and store the result together with each keyview.

In Figure 5, all three vertices 1, 2 and 3 of the shaded triangle are first projected to the keyview v1 using its projection matrix M1 to obtain their screen coordinates t1, t2 and t3, which are used as texture coordinates. The triangle is then re-projected using the novel view projection matrix M2 during the texture mapping. To ensure projective-correct texture mapping, we use homogeneous coordinates for the texture coordinates. This has been discussed in detail in [10].

Figure 5: Texture mapping. Vertices 1, 2, 3 project to texture coordinates t1, t2, t3 in the keyview v1 (matrix M1), and to 1', 2', 3' in the novel view v2 (matrix M2).

When assigning texture coordinates to each triangle, we have to determine whether the keyview is a valid texture source. First, we test whether the triangle is visible in the keyview. If yes, we further evaluate its texture fidelity and test whether it is greater than a certain threshold. Only after passing these two tests is the triangle texture mapped with the keyview image; otherwise, it is drawn in the special color, indicating a potential hole. Notice that most of the previously invisible triangles remain invisible in the novel view; therefore, holes occupy only a small portion of the novel frame.
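A sketch of the texture coordinate computation just described, assuming M1 is the keyview's combined projection-view matrix; the result is kept homogeneous (it could be issued as glTexCoord4d(s, t, 0, q)) so that the per-pixel division yields the projective-correct mapping of [10]. The matrix and vector helpers are our own:

```cpp
#include <array>

using Mat4 = std::array<std::array<double, 4>, 4>;
struct Vec4 { double x, y, z, w; };

static Vec4 mul(const Mat4& m, const Vec4& v)
{
    return { m[0][0]*v.x + m[0][1]*v.y + m[0][2]*v.z + m[0][3]*v.w,
             m[1][0]*v.x + m[1][1]*v.y + m[1][2]*v.z + m[1][3]*v.w,
             m[2][0]*v.x + m[2][1]*v.y + m[2][2]*v.z + m[2][3]*v.w,
             m[3][0]*v.x + m[3][1]*v.y + m[3][2]*v.z + m[3][3]*v.w };
}

// Project a mesh vertex with the keyview matrix M1 and remap clip space
// [-w, w] to texture space [0, w], so that (s/q, t/q) lies in [0, 1] over
// the keyview image. Keeping q (the homogeneous weight) avoids the affine
// distortion that a plain 2D texture coordinate would introduce.
void keyviewTexCoord(const Mat4& M1, const Vec4& vertex,
                     double& s, double& t, double& q)
{
    const Vec4 clip = mul(M1, vertex);
    s = 0.5 * (clip.x + clip.w);
    t = 0.5 * (clip.y + clip.w);
    q = clip.w;
}
```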
One candidate solution is to organize the whole trian- gle list into a visible and an invisible list. We render these two lists One issue for using the z-buffer is that a vertex is usually pro- successively and switch the rendering mode only once. However, jected to a non-grid point on the keyview screen. Using a z-buffer this is not an appealing approach because whenever the keyview value at the closest grid position does not always give us correct changes, we have to reorganize the lists. Instead, we adopt a sim- visibility because that z value can represent the neighboring trian-

pler approach. Before we load a keyview into the texture memory, gle. Interpolating among neighboring z values is also inappropriate

; 0 we overwrite the texel at 0 (the origin of the image) with the because they can represent disconnected objects. To better solve

special color. For an invisible triangle, we specify all its three ver- this problem, we evaluate the following equation:

; 0

tices with the same texture coordinates 0 . The triangle is thus
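In OpenGL terms, this trick amounts to a one-texel update plus degenerate texture coordinates. A sketch, where keyviewTex and the vertex data are assumed to be set up elsewhere, and nearest filtering is assumed so that coordinate (0, 0) samples exactly the overwritten texel:

```c
#include <GL/gl.h>

/* The user-specified special color (red here, as in Figure 2). */
static const GLubyte special[3] = { 255, 0, 0 };

/* Done once, when the keyview is loaded into texture memory: overwrite
   texel (0,0), the image origin, with the special color. */
void markSpecialTexel(GLuint keyviewTex)
{
    glBindTexture(GL_TEXTURE_2D, keyviewTex);
    glTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, 1, 1,
                    GL_RGB, GL_UNSIGNED_BYTE, special);
}

/* An invisible triangle: all three vertices share texture coordinate (0,0),
   so the triangle is uniformly filled with the special color without
   leaving texture-mapping mode. */
void drawInvisibleTriangle(const GLfloat* v0, const GLfloat* v1,
                           const GLfloat* v2)
{
    glBegin(GL_TRIANGLES);
    glTexCoord2f(0.0f, 0.0f); glVertex3fv(v0);
    glTexCoord2f(0.0f, 0.0f); glVertex3fv(v1);
    glTexCoord2f(0.0f, 0.0f); glVertex3fv(v2);
    glEnd();
}
```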

The visibility of a polygon in the keyview falls into four categories: (1) fully visible; (2) fully occluded or view-frustum culled; (3) partially occluded; (4) partially view-frustum clipped. During the texture mapping, a partially occluded triangle is simply drawn in the special color without using the texture corresponding to the visible part, similar to Chen et al. [4]. This is because achieving that would require visibility clipping on the triangle — subdividing the triangle into visible and invisible child triangles — as described by Debevec et al. [11]. Performing this visibility clipping on triangles is not only expensive, but also generates too many small triangles, and it has to be performed for every keyview. Nevertheless, if a triangle is partially clipped by the viewing frustum, we desire to make use of the partially visible texture. This becomes very important for virtual colon navigation, where the camera is very close to the colon wall, such that these clipped triangles are usually the closest to the camera, having large projections on the screen. Merely invalidating texture mapping for the whole triangle leads to big holes. Our solution is to overwrite the border of the texture image with the special color and specify the texture wrapping parameter as CLAMP in the call to glTexParameter(). The view-frustum clipped part of a triangle has texture coordinates outside the range [0, 1], which are clamped to the border color (now the special color), while the non-clipped part is still correctly texture mapped.
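A sketch of the corresponding texture state, assuming the edge texels of the keyview image have already been overwritten with the special color; with GL_CLAMP, coordinates outside [0, 1] resolve to those border texels, flagging the clipped part of the triangle as a hole:

```c
#include <GL/gl.h>

void setClampForClippedTriangles(GLuint keyviewTex)
{
    glBindTexture(GL_TEXTURE_2D, keyviewTex);
    /* Out-of-range texture coordinates (the view-frustum clipped part of a
       triangle) are clamped onto the special-colored border texels; the
       non-clipped part is still correctly texture mapped. */
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_CLAMP);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_CLAMP);
}
```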

To detect visibility exactly, we would need to scan-convert the whole triangle and determine the visibility of each pixel inside, the same as z-buffer testing. This is obviously too expensive to perform. Instead, we only test its three vertices. Only when all three vertices are visible is the triangle considered visible. See Figure 6, where triangles 3, 4, and 5 each have a vertex occluded and are treated as invisible. This detection fails for triangle 2, which is partially occluded although none of its vertices is, such that it is erroneously determined as visible. Fortunately, our experiments show that this is very rare in a real data set, with only a few occurrences during a long navigation.

Figure 6: Visibility detection.

For each keyview, we generate and store its z-buffer for visibility detection. A vertex is projected onto the keyview projection plane and its depth is compared with the corresponding z value in the z-buffer. The z-buffer for a keyview can be generated in two ways: as a by-product of ray casting, by detecting the iso-point when sampling along the ray, or by rendering the underlying geometry at the keyview camera position and reading out the z-buffer. We adopt the latter approach, since the depth buffer can also be utilized to accelerate ray casting by quickly skipping empty space (space leaping).

One issue in using the z-buffer is that a vertex is usually projected to a non-grid point on the keyview screen. Using the z-buffer value at the closest grid position does not always give us the correct visibility because that z value can represent a neighboring triangle. Interpolating among neighboring z values is also inappropriate because they can represent disconnected objects. To better solve this problem, we evaluate the following equation:

    |Z_vertex − Z_buffer| < ε        (1)

where Z_vertex is the calculated z value of the vertex, Z_buffer is the z-buffer value at the closest grid point, and ε is the specified "thickness" of the visible surface. As long as Equation 1 is true, the vertex is visible. Yet, this evaluation may be too conservative in certain situations. For example, when a visible plane is almost perpendicular to the viewing plane, two distant points on the plane projecting to similar screen locations may have a great z difference. The above evaluation may then mistakenly detect one vertex as invisible. We argue that in this situation the plane projects to a very small area, such that even if we falsely take the area as a hole, the extra ray-casting effort is little.

To detect whether a triangle is clipped by the viewing frustum in the keyview, we check its 2D vertex projections on the keyview plane. If any of them is beyond the clipping window, the triangle is partially clipped.

When a triangle is detected as visible in the keyview, we need to further evaluate its texture fidelity. Here, our metric is the texture magnification, or scaling factor, as in [7]. While we could use the same scheme as in [7] to compute the scaling factor for each triangle, we instead use a static and faster, but more conservative, approximation. We calculate the angle between the keyview direction and the triangle normal, which can be approximated by a simple dot product. The larger the angle, the lower the texture fidelity.
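Putting Equation 1 and the fidelity metric together, the per-triangle classification reduces to a few comparisons; a sketch, with names of our own choosing (zVertex and zBuffer are the projected and stored depths; keyviewDir and normal are unit vectors):

```cpp
#include <cmath>

struct Vec3 { double x, y, z; };

static double dot(const Vec3& a, const Vec3& b)
{
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// Equation (1): a vertex is visible when its computed depth agrees with the
// stored keyview z-buffer value to within the surface "thickness" epsilon.
bool vertexVisible(double zVertex, double zBuffer, double epsilon)
{
    return std::fabs(zVertex - zBuffer) < epsilon;
}

// A triangle is a valid texture target only if all three vertices pass
// Equation (1) and its texture fidelity, approximated here by |cos| of the
// angle between the keyview direction and the triangle normal, exceeds a
// user threshold. Otherwise it is drawn in the special color.
bool triangleUsesKeyviewTexture(const bool vertexVisible3[3],
                                const Vec3& keyviewDir, const Vec3& normal,
                                double fidelityThreshold)
{
    if (!vertexVisible3[0] || !vertexVisible3[1] || !vertexVisible3[2])
        return false;
    const double fidelity = std::fabs(dot(keyviewDir, normal)); // 1 = head-on
    return fidelity >= fidelityThreshold;
}
```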
5 RENDERING KEYVIEWS

Keyviews are fully volume rendered, either on-the-fly or during preprocessing. Because keyviews are used to generate novel views, we cannot support a view-dependent specular lighting model in keyview rendering. Instead, we employ a purely diffuse lighting model. This is a common choice for most IBR methods. Further, during the texture mapping, we use DECAL mode in the call to glTexEnvf() to avoid superimposing any lighting effect.

To render keyviews on-the-fly, one issue is when to render them. When the novel view moves away from the keyview, more invisible polygons get exposed, resulting in larger holes; therefore more effort is required for ray-casting. For every novel view, we evaluate the hole area percentage, that is, the ratio of the number of new rays to the area of object occupancy on screen. When the hole area percentage of a novel view is beyond a given threshold, we decide to render a new keyview.

Alternatively, we can pre-render a group of keyviews during preprocessing so that any novel view can be generated from one (or a subset) of them. Thus the system never has to ray-cast the whole volume during the navigation. In contrast to previous image-based rendering methods [12, 20], the number of keyviews can be greatly reduced, since we take advantage of a full underlying 3D geometry (in the Lumigraph, only a coarse 3D model is used). The major issue in pre-rendering is where to place the keyview cameras. It remains an open problem to determine, for a given 3D model, the minimal set of keyframe images such that for all novel viewpoints a correct image can be obtained by combining the keyframe images [27]. This statement also applies to volume representations. However, full coverage of the model by keyviews is not our goal, since we can detect holes and fill them by casting new rays on the fly.

There are different strategies for placing keyview cameras, depending on the navigation style. Figure 7 illustrates the keyview camera configuration for examining an object from the outside. Here we place a sphere around the volume and distribute the cameras evenly on the sphere, all pointing at the center of the object. Given the novel view (a camera position and view direction) shown in Figure 7, the view vector intersects the sphere at a point (α, β). There are several ways of utilizing the surrounding keyviews c11, c12, c21, c22 to generate the novel view (α, β). One option is to render the object multiple times using each keyview in turn and blend the results. The blending method not only creates too fuzzy an image, but is also very expensive to perform. We determine, instead, to render only once by choosing the closest keyview for each polygon. We determine the closest keyview for each triangle by comparing the angle between the polygon normal and the view vector of each surrounding keyview. In our implementation, we only test against the four neighboring keyviews for efficiency.

Figure 7: Distribution of keyviews around the object. The novel view vector intersects the surrounding sphere at (α, β), with neighboring keyview cameras c11, c12, c21, c22.
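The per-polygon keyview selection can be written as a small loop over the (four) candidate cameras; a sketch, with unit vectors assumed and the data layout ours:

```cpp
#include <cmath>
#include <cstddef>

struct Dir3 { double x, y, z; };   // unit vectors assumed

// Pick the keyview whose view vector makes the smallest angle with the
// polygon normal, i.e., the largest |cos|; that keyview supplies the
// texture for this polygon. In our implementation only the four
// neighboring keyviews are tested.
std::size_t closestKeyview(const Dir3& normal,
                           const Dir3* keyviewDirs, std::size_t count)
{
    std::size_t best = 0;
    double bestCos = -1.0;
    for (std::size_t i = 0; i < count; ++i) {
        const double c = std::fabs(normal.x * keyviewDirs[i].x
                                 + normal.y * keyviewDirs[i].y
                                 + normal.z * keyviewDirs[i].z);
        if (c > bestCos) { bestCos = c; best = i; }
    }
    return best;
}
```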
The virtual colon fly-through has a different navigation style. A critical issue is where to place the keyviews so that we most appropriately capture the entire colon. There are different ways to specify keyviews along the center line: either interactively specify them during an initial navigation, or automatically assign keyviews based on the curvature of the center line. The first approach is very tedious for the user. We have adopted the second approach. We place more cameras where the curvature of the center line is high. In addition, we limit the maximum distance between two neighboring cameras to guarantee better coverage of a long straight path. One issue is that the provided center line is defined on grid points, so it has too much local noise. It is inaccurate to calculate a curvature from these local values. Our solution is to first apply a low-pass filter to smooth the curve and then calculate the curvature from the local points. For the low-pass filtering, we convolve the center line twice with a box filter of width 3. Figure 8 shows the result of the camera placement. Because of the 2D projection, it does not reflect the overall curvature in 3D.

Figure 8: Placing keyview cameras along the skeleton. The line is the smoothed skeleton line, and the dots represent the camera locations.
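The smoothing step is an ordinary discrete convolution; a sketch of one width-3 box-filter pass over the center line (applied twice in our system), with the endpoints simply kept fixed (a boundary-handling assumption on our part):

```cpp
#include <cstddef>
#include <vector>

struct Point3 { double x, y, z; };

// One pass of a width-3 box filter over the (grid-defined, hence noisy)
// center line: interior points become the average of themselves and their
// two neighbors; endpoints are left unchanged.
std::vector<Point3> boxFilterWidth3(const std::vector<Point3>& line)
{
    std::vector<Point3> out = line;
    for (std::size_t i = 1; i + 1 < line.size(); ++i) {
        out[i].x = (line[i - 1].x + line[i].x + line[i + 1].x) / 3.0;
        out[i].y = (line[i - 1].y + line[i].y + line[i + 1].y) / 3.0;
        out[i].z = (line[i - 1].z + line[i].z + line[i + 1].z) / 3.0;
    }
    return out;
}

// As in the text, the filter is applied twice before curvature estimation:
// std::vector<Point3> smooth = boxFilterWidth3(boxFilterWidth3(centerLine));
```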

6 IMPLEMENTATIONS AND RESULTS

We have implemented our framework on an SGI Challenge equipped with an Infinite Reality engine, 3GB of memory, and 16 R10000 processors. However, in our experiments, only one R10000 processor is used. We have applied our algorithm to different data sets: a CT head (128×128×113), a CT scan of a plastic pipe (512×512×107), and a patient's colon (512×512×361). The plastic pipe has a radius of 20mm. To simulate colonic polyps, three small rubber objects of size 7mm, 5mm, and 3mm are attached to the inner surface of the pipe. We have experimented with accelerating two kinds of volume navigations: examining the CT head from the outside, and flying through the pipe/colon. Renderings of the CT head create images of 256×256 resolution, while 512×512 resolution images are created for the pipe and colon data sets. The ray-casting method we compare against is the optimized version in [28]. The same ray-casting routine is used in our hole-filling procedure to guarantee a fair comparison. We have used perspective projection for all the renderings.

Our results show that our method can deliver a high speedup while maintaining image quality. Because our image-based volume rendering is divided into two steps, polygon rendering and ray-casting, the rendering speed depends on two factors: the complexity of the polygon model, and the transfer function ramp, which determines how far a ray traverses the data before it reaches full opacity. We have used a moderately simplified surface model, as in Figure 4b, for the CT head data (Figure 9), but a non-simplified surface model for both the colon (Figure 10) and pipe (Figure 11) data. We used a linear ramp between 45 and 65 as the transfer function for the CT head, which means that the opacity is zero below 45, changes from zero to one (indicating full opacity) from 45 to 65, and remains one beyond 65 (maximum voxel value 255). For the plastic pipe data, we used a linear ramp between 30 and 80, while a linear ramp between 30 and 100 is used for the colon data. The iso-value of the surface is set to 45, 30 and 30, respectively, for the three data sets when constructing the iso-surfaces. Figures 9-11 show that when rendering a novel view we only need to cast rays for a very small portion of the image. For all these data sets, we have obtained an average of an order of magnitude speedup over the optimized volume rendering (all the timings listed in the captions are average timings). In the meantime, the filled pixels match well with the surrounding texture mapped pixels. Figure 9e justifies this by showing an intensity scaled-up difference image between a volume rendered novel view and our image-based rendered view. The calculated Mean Square Error (MSE) of the image is 0.012.
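For reference, the MSE quoted above and the intensity scaled-up difference image of Figure 9e can be produced as follows; grayscale intensities in [0, 1] and the caption's scale factor of 5 are assumed:

```cpp
#include <cstddef>
#include <vector>

// Mean square error between a fully volume rendered view (a) and the
// image-based view (b); diff receives the intensity scaled-up absolute
// difference image, clamped to [0, 1].
double mseAndDifferenceImage(const std::vector<double>& a,
                             const std::vector<double>& b,
                             std::vector<double>& diff,
                             double scale = 5.0)
{
    double sum = 0.0;
    diff.resize(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) {
        const double d = a[i] - b[i];
        sum += d * d;
        const double s = scale * (d < 0.0 ? -d : d);
        diff[i] = (s > 1.0) ? 1.0 : s;
    }
    return a.empty() ? 0.0 : sum / static_cast<double>(a.size());
}
```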

We have successfully applied our algorithm to the virtual colonoscopy system. Our experiments were conducted on top of the original polygon-based prototype system. In virtual colonoscopy, image fidelity is considered a top priority. Therefore, the underlying geometry has not been simplified. Since both view-frustum culling and occlusion culling, taking advantage of the twisted colon feature, have been applied to the surface model during the rendering, the model can be rendered very efficiently. The navigation is performed in the same style as in the previous surface colonoscopy system [15]. Without clicking the mouse, the camera automatically flies through the colon along the center line. A mouse click imposes a force that influences the moving speed and direction of the camera. The speedup obtained from our algorithm is substantial. Figure 12 illustrates the timing of an interactive pipe navigation when keyviews are rendered on the fly. The hole area percentage threshold is set to 1.5%. 32 out of 467 frames are fully volume rendered and used as keyviews, yielding a keyview update ratio of 7.4%. The average rendering time of the keyviews is 1.9 seconds per frame, which includes the time for calculating the visibility map. For clarity, the keyview rendering time is clamped on the graph. The average rendering time for novel views is 0.17 seconds, indicating more than an order of magnitude speedup. The keyview update ratio depends on the user's navigation path and the underlying object structure. If the user makes a sudden turn, or more small features of the object emerge, the keyview will need to be updated more frequently. Figure 12 also shows the timing of the two rendering steps: texture mapping and filling holes. The texture mapping time depends on the complexity of the underlying geometry within the current view frustum, with an average of 0.08 seconds, which includes the frame-buffer loading time of 0.02 seconds; the time for filling holes is also view dependent, with an average of 0.11 seconds. The total time of each frame is the sum of the two components. For a keyview, the texture mapping time component represents the total time of the polygon rendering (used for accelerating ray-casting) and the calculation of the triangle visibility map. The average time for calculating a visibility map is 0.18 seconds per frame. This justifies the benefits of pre-calculating the visibility.

A delay in keyframe rendering causes jerky navigation. Therefore, we can instead pre-render keyviews and calculate their visibility maps in advance. The keyviews are positioned according to the method described in Section 5. Figure 13 illustrates the rendering time of an interactive pipe navigation using 100 pre-rendered keyviews. The rendering is smoother than when rendering keyviews on-the-fly. Nevertheless, the rendering time depends on the navigation path. Between frames 220 and 260, the rendering time increases abruptly; this is because the user has made a sudden maneuver, turning the camera toward the colon wall. Since currently all keyviews are rendered with the camera pointing along the center line, a view facing the wall benefits little from the keyview, resulting in big holes to fill. After the user turns the camera back to the center line direction, the rendering time drops again. Another time increase, around frame 390, is caused by a similar camera turn. Similar experiments have been done on a patient's colon data set. Because of the complex colon wall, we have used 200 keyframes to capture most of the inside structure. The rendering time is captured in Figure 14. The average rendering time of our method is 0.33 seconds per frame, while the average rendering time for volume rendering is 3.6 seconds.

We also evaluated how the rendering benefits from increasing the number of keyframes for the same object. The rendering time of a CT head navigation with a pre-defined path using different numbers of pre-rendered keyviews (6, 14 and 26) is plotted in Figure 15. Keyviews are placed according to Section 5. For the 6 keyview configuration, two keyviews are placed on the north and south poles of the sphere, and the remaining 4 are evenly distributed on the 0 degree latitude. For the 14 keyview configuration, we add another two latitudes, and again another two for 26 keyviews. The navigation rotates the CT head around the x, y and z axes consecutively. The average rendering time for 6, 14, and 26 keyviews is 0.16, 0.11 and 0.09 seconds, respectively. When the number of keyviews increases, the rendering becomes not only faster, but also smoother. There is a limit in the speed benefit of increasing the number of keyviews. For 26 keyviews, the average percentage of new rays to the whole image is 1.0%. A further increase in the number of keyviews does not improve the performance significantly.
7 CONCLUSIONS AND FUTURE WORK

We have presented an image-based rendering framework for accelerating surface volume rendering. We have provided alternative implementation designs and applied them to different volume data sets and different types of navigation. Its successful application to the virtual colonoscopy system has clearly proven the effectiveness of our approach. Our experiments have demonstrated an order of magnitude speedup over the existing optimized volume rendering while delivering visually indistinguishable image quality. The rendering speed and the image quality can be tuned by employing different levels of detail of the underlying surface geometry. Even though we have done our experiments on workstations, a modern PC could be sufficient for delivering similar performance using our method. Today's PC equipped with a commodity graphics card can meet our computation requirements.

When rendering keyviews on-the-fly, our most important future work is to address the variable latency caused by rendering the keyframes. One ongoing work is to predict where the next keyview position will be and to amortize the keyview rendering among the successive novel view renderings. Another future experiment is to cache previous keyviews for reuse. This caching will especially benefit the situation where the user moves back and forth to examine a particular part of the object. To alleviate the time increase when the camera turns toward the colon wall, one candidate solution is to generate multiple keyviews at each selected keyview position. Besides generating keyviews pointing along the center line, we would also generate keyviews pointing to the wall; for example, four views looking left, right, up and down. We are currently experimenting with this approach.

In the current implementation, we have emphasized applying our technique to accelerate volume rendering with a narrow transfer function ramp. As briefly mentioned in Section 2, when the transfer function has a wider ramp (a more translucent volume), we should cut the volume into slabs parallel to the viewing plane and then composite the slab images. An immediate issue to be addressed is how to efficiently perform this view dependent cutting on a surface model. We would also like to know: (1) how thick the slabs should be so that we can balance the performance and the image quality, and (2) whether we always have to render the whole slabs or can still perform early ray termination, and how. Finally, a general issue worth addressing is to design more sophisticated error metrics for the on-the-fly evaluation.

One limitation of our method is that whenever the transfer function changes, we have to reconstruct the mesh. This may become a bottleneck of our technique. It is therefore critical to design a more efficient surface construction method. Another issue related to the surface model is its silhouette, which is critical to the image quality [26]. We are currently investigating techniques to improve the silhouette of the model while conducting a more aggressive simplification of the inside.

8 ACKNOWLEDGMENTS

This work has been supported by ONR grant N00014011034, NIH grant #CA82402, and Viatronix Inc. The first author would also like to acknowledge a Grant-in-Aid of Research, Artistry and Scholarship from the Office of the Vice President for Research and Dean of the Graduate School of the University of Minnesota. We wish to thank Ming Wan for providing us the ray-casting code and Huamin Qu for his valuable help on the implementation of the virtual colonoscopy system. Thanks to Dongrong Xu for his evaluation work and to Klaus Mueller and Andreas Baerentzen for their review and comments. The pipe and patient data sets were provided by Stony Brook University Hospital.

References

[1] R. Avila, L. Sobierajski, and A. Kaufman. Towards a comprehensive volume visualization system. Proc. of IEEE Visualization '92, pages 13-20, Oct. 1992.
[2] M. Brady, K. Jung, H. Nguyen, and T. Nguyen. Two-phase perspective ray casting for interactive volume navigation. In Proceedings of Visualization '97, pages 183-189, Oct. 1997.
[3] B. Cabral, N. Cam, and J. Foran. Accelerated volume rendering and tomographic reconstruction using texture mapping hardware. In 1994 Symposium on Volume Visualization, pages 91-98, Oct. 1994.
[4] B. Chen, J. E. Swan, E. Kuo, and A. Kaufman. LOD-sprite technique for accelerated terrain rendering. Proc. of IEEE Visualization '99, pages 291-298, Oct. 1999.
[5] J. J. Choi and Y. G. Shin. Efficient image-based rendering of volume data. In Proceedings of Pacific Graphics '98, pages 70-78, Oct. 1998.
[6] D. Cohen and Z. Shefer. Proximity clouds - an acceleration technique for 3D grid traversal. Technical Report FC 93-01, Department of Mathematics and Computer Science, Ben Gurion University of the Negev, Feb. 1993.
[7] D. Cohen-Or, Y. Mann, and S. Fleishman. Deep compression for streaming texture intensive animations. Computer Graphics, 33(Annual Conference Series):261-268, 1999.
[8] F. Dachille, K. Kreeger, B. Chen, I. Bitter, and A. Kaufman. High-quality volume rendering using texture mapping hardware. SIGGRAPH/Eurographics Workshop on Graphics Hardware Proceedings, pages 69-76, Aug. 1998.
[9] W. J. Dally, L. McMillan, G. Bishop, and H. Fuchs. The delta tree: An object-centered approach to image-based rendering. Technical Memo 1604, MIT AI Lab, May 1996.
[10] L. Darsa, B. C. Silva, and A. Varshney. Navigating static environments using image-space simplification and morphing. 1997 Symposium on Interactive 3D Graphics, pages 25-34, April 1997.
[11] P. E. Debevec, Y. Yu, and G. Borshukov. Efficient view-dependent image-based rendering with projective texture-mapping. Rendering Techniques '98, pages 105-116, 1998.
[12] S. J. Gortler, R. Grzeszczuk, R. Szeliski, and M. F. Cohen. The lumigraph. Computer Graphics Proceedings, pages 43-54, 1996.
[13] B. Gudmundsson and M. Randen. Incremental generation of projections of CT-volumes. In The First Conference on Visualization in Biomedical Computing, pages 27-34, May 1990.
[14] T. He and A. Kaufman. Fast stereo volume rendering. Proc. of IEEE Visualization '96, pages 49-56, Oct. 1996.
[15] L. Hong, S. Muraki, A. Kaufman, D. Bartz, and T. He. Virtual voyage: Interactive navigation in the human colon. Computer Graphics Proceedings, pages 27-34, Aug. 1997.
[16] A. Kaufman, F. Dachille, B. Chen, I. Bitter, K. Kreeger, N. Zhang, and Q. Tang. Real-time volume rendering. International Journal of Imaging Systems and Technology, 2000.
[17] P. Lacroute and M. Levoy. Fast volume rendering using a shear-warp factorization of the viewing transform. Computer Graphics Proceedings, pages 451-457, July 1994.
[18] M. Levoy. Display of surfaces from volume data. IEEE Computer Graphics & Applications, 8(5):29-37, May 1988.
[19] M. Levoy. Efficient ray tracing of volume data. ACM Transactions on Graphics, 9(3):245-261, July 1990.
[20] M. Levoy and P. Hanrahan. Light field rendering. In Computer Graphics Proceedings, pages 31-42, 1996.
[21] L. McMillan and G. Bishop. Plenoptic modeling: An image-based rendering system. Computer Graphics Proceedings, pages 39-46, Aug. 1995.
[22] K. Mueller, N. Shareef, J. Huang, and R. Crawfis. IBR-assisted volume rendering. In Late Breaking Hot Topics of Visualization '99, pages 5-9, Oct. 1999.
[23] S. Parker, P. Shirley, Y. Livnat, C. Hansen, and P. Sloan. Interactive ray tracing for isosurface rendering. In Proceedings of Visualization '98, pages 233-238, Oct. 1998.
[24] H. Pfister, J. Hardenbergh, J. Knittel, H. Lauer, and L. Seiler. The VolumePro real-time ray-casting system. Computer Graphics Proceedings, pages 251-260, 1999.
[25] T. Roxborough and G. M. Nielson. Tetrahedron based, least squares, progressive volume models with application to freehand ultrasound data. Proc. of IEEE Visualization 2000, pages 93-100, Oct. 2000.
[26] P. V. Sander, X. Gu, S. J. Gortler, H. Hoppe, and J. Snyder. Silhouette clipping. In K. Akeley, editor, SIGGRAPH 2000, Computer Graphics Proceedings, Annual Conference Series, pages 327-334. ACM Press / ACM SIGGRAPH / Addison Wesley Longman, 2000.
[27] S. Teller and D. Dobkin. CRC Handbook of Discrete and Computational Geometry, chapter Computer Graphics. CRC Press, 1997.
[28] M. Wan, Q. Tang, A. Kaufman, Z. Liang, and M. Wax. Volume rendering based interactive navigation within the human colon. Proc. of IEEE Visualization '99, pages 397-400, Oct. 1999.
[29] R. Westermann and T. Ertl. Efficiently using graphics hardware in volume rendering applications. In SIGGRAPH Conference Proceedings, pages 169-178, July 1998.
[30] R. Yagel and Z. Shi. Accelerating volume animation by space-leaping. In Proceedings of Visualization '93, pages 62-69, Oct. 1993.
[31] S. You, L. Hong, M. Wan, K. Junyaprasert, A. Kaufman, S. Muraki, Y. Zhou, M. Wax, and Z. Liang. Interactive volume rendering for virtual colonoscopy. Proc. of IEEE Visualization '97, pages 433-436, Nov. 1997.
[32] Y. Zhou, B. Chen, and A. Kaufman. Multiresolution tetrahedral framework for visualizing regular volume data. Proc. of IEEE Visualization '97, pages 135-142, Oct. 1997.

Figure 9: Two views during an interactive CT head navigation (image resolution: 256×256): (a) fully volume rendered keyview; (b) fully volume rendered novel view (rotated 20 degrees, 3.5 secs); (c) the same novel view after texture mapping image (a) onto the surface model of Figure 4b, with unfilled holes (colored red — see color plate, 0.07 secs); (d) after filling the holes in (c) (c+d: 0.16 secs); (e) intensity-scaled-up difference image between (b) and (d) (scale factor: 5, MSE: 0.012).

Figure 10: Two views during an interactive colon navigation (image resolution: 512×512): (a) fully volume rendered keyview (3.6 secs); (b) texture mapped novel view with unfilled holes (colored green — lightly shaded on black-and-white printing, 0.09 secs); (c) after filling the holes in (b) (b+c: 0.3 secs).

Figure 11: Two views during an interactive pipe navigation (image resolution: 512×512): (a) fully volume rendered keyview (1.9 secs); (b) texture mapped novel view with unfilled holes (colored green — lightly shaded on black-and-white printing, 0.08 secs); (c) after filling the holes in (b) (b+c: 0.17 secs).

Figure 12: The rendering time of an interactive pipe navigation using on-the-fly keyview rendering; the curves show the texture mapping, ray casting, and total time per frame (also in color plate).

Figure 13: The rendering time of an interactive pipe navigation using 100 pre-rendered keyviews.

Figure 14: The rendering time of an interactive colon navigation using 200 pre-rendered keyviews.

Figure 15: The rendering time of an interactive CT head navigation using different numbers of pre-rendered keyviews (6, 14, and 26; also in color plate).