TECHNICAL UNIVERSITY DRESDEN

FACULTY OF COMPUTER SCIENCE
INSTITUTE OF SOFTWARE AND MULTIMEDIA TECHNOLOGY
CHAIR OF COMPUTER GRAPHICS AND VISUALIZATION
PROF. DR. STEFAN GUMHOLD

Diploma thesis

for the acquisition of the academic degree Diplom-Informatiker

Filtering reflection properties of rough surfaces

Andreas Ecke (born November 11, 1987 in Ilmenau)

Supervisors: Prof. Dr. rer. nat. Stefan Gumhold, Dipl.-Medien-Inf. Andreas Stahl

Dresden, September 27, 2012

Task

Filtering of color textures with mipmaps is important to reduce artifacts when rendering a surface at different resolutions. However, this approach does not work for rough surfaces with varying reflection properties. For filtering of reflection properties, solutions like BRDF hierarchies and BTF filters are available, but have several drawbacks. For geometry filtering, commonly used techniques include geometry hierarchies and the rendering of displacement maps by ray-casting. The objective of this work is to design a filtering approach for the reflection properties of rough surfaces that can be used to realistically render a surface at different resolutions. This approach should combine BTFs with ray-casting based displacement mapping to allow the rendering of the surface in real-time.

Subgoals:
• Literature research on BTFs, filtering and displacement mapping
• Procedure for synthesis of a BTF from a surface description
• Design of a BTF filter with image correlation
• Procedure for real-time rendering of a surface with displacement mapping
• Investigation of properties of the implementation like handling the empty border of the BTF images, selection of the mipmap level, and application on curved surfaces
• Evaluation of the approach concerning correctness and time requirements
• Optional: BTF compression, illumination by sky maps

Statement of authorship

I hereby certify that the diploma thesis I submitted today to the examination board of the faculty of computer science with the title:

Filtering reflection properties of rough surfaces

has been composed solely by myself and that I did not use any sources and aids other than those stated, with quotations duly marked as such.

Dresden, September 27, 2012

Andreas Ecke

Abstract

Filtering of color textures with mipmaps is important to reduce artifacts when rendering a surface at different resolutions. However, this approach does not work for rough surfaces with varying reflection properties. For filtering of reflection properties, solutions like BRDF hierarchies and BTF filters are available, but have several drawbacks. For geometry filtering, commonly used techniques include geometry hierarchies and the rendering of displacement maps by ray-casting. In this thesis, a filtering approach is designed for the reflection properties of rough surfaces that can be used to realistically render a surface at different resolutions. This approach combines BTFs with ray-casting based displacement mapping to allow high-quality rendering of the surface at any resolution in real-time.


Contents

Nomenclature

1 Introduction
  1.1 Structure of this thesis

2 Preliminaries
  2.1 Rough Surfaces
    2.1.1 Per-vertex displacement mapping
    2.1.2 Per-pixel displacement mapping
    2.1.3 Curved surfaces
  2.2 Varying reflectance
    2.2.1 BRDFs
    2.2.2 Models
  2.3 Filtering
    2.3.1 Mipmapping
    2.3.2 Transition between bump rendering algorithms
    2.3.3 BRDF mixture models
  2.4 Bidirectional texture functions
    2.4.1 LOD for BTFs
    2.4.2 Compression and rendering

3 BTF generation
  3.1 Surface rendering
    3.1.1 Lighting models
    3.1.2 Per-pixel displacement mapping
  3.2 Image rectification
    3.2.1 Transformation
    3.2.2 Projection matrix
  3.3 Filtering & Compression
    3.3.1 Laplacian pyramid
    3.3.2 Principal component analysis
    3.3.3 Texture packing
  3.4 Handling of empty border pixels

4 Rendering
  4.1 BTF rendering
    4.1.1 Projection & Directions
    4.1.2 Decompression
    4.1.3 Level of Detail
    4.1.4 Interpolation
  4.2 Transition between the BTF and relief renderer
  4.3 Curved surfaces
    4.3.1 Using BTFs with curved surfaces

5 Evaluation
  5.1 Performance
    5.1.1 Preprocessing
    5.1.2 Rendering
  5.2 Level of detail
  5.3 BTF compression
    5.3.1 Border pixels
    5.3.2 Mean encoding
  5.4 Interpolation
  5.5 Curved surfaces

6 Conclusion

Bibliography

List of Figures

List of Tables

Nomenclature

(n,t,b) or TBN  tangent space consisting of normal vector n, tangent t and bi-tangent b
(s,t) and (u,v)  texture coordinates
µk  mean vector resulting from performing PCA on the kth Gaussian pyramid levels
ω = (θ,φ)  direction consisting of inclination θ and polar angle φ
ωi = (θi,φi)  direction of incident irradiance
ωr = (θr,φr)  direction of emitted radiance
ρ  albedo of the surface
σ  surface roughness for the Oren-Nayar reflection model
depth(u,v)  value stored in the depth map for texture coordinates (u,v)
E0  light intensity
F0  reflectance of a surface at normal incidence
fr(ωr,ωi)  general BRDF defined for incident direction ωi and exitant direction ωr
fr(θr,θi;φr − φi)  isotropic BRDF
Gk^(ωi,ωr)  kth level of the Gaussian pyramid for light direction ωi and view direction ωr
h  half vector between l and v
ka, kd and ks  ambient, diffuse and specular color of a surface
l  light direction
Lk^(ωi,ωr)  kth level of the Laplacian pyramid for light direction ωi and view direction ωr
Lr  resulting color from evaluating the reflection model
lod  level of detail parameter
m  surface roughness for the Cook-Torrance reflection model
n  normal vector
p = (x,y)  a point on the surface
s  shininess exponent for the Blinn-Phong reflection model
v  view direction

1 Introduction

Rendering surfaces is one of the most fundamental tasks in computer graphics, as objects are generally represented by their surfaces only. Surfaces of natural objects can be highly complex, featuring roughness, a certain structure and varying reflection properties. Rendering such surfaces is an equally complex task, and much research has been done to find methods to render them as realistically and as fast as possible.

An example of such a complex surface is a planet in a space simulation. The planet surface has different elevations, like high mountain ranges and flat oceans, and highly varying reflection properties such as specular water, diffuse forests and glistening, snowy mountain tops. This high complexity also holds for surfaces at other scales, such as house fronts with their reflective windows, a glossy car paint with dirt splashes, or printed circuit boards. By combining several reflection models and using techniques such as displacement mapping to add geometric detail, these surfaces can be rendered fairly well.

However, another important aspect of realistic rendering is filtering of the surface at low resolutions, i.e. when the surface is far away, and this is where these basic methods fail: they produce heavy aliasing artifacts or change the appearance of the filtered surface incorrectly. This is because standard filtering of the base textures is incorrect for general surfaces. This can be seen for our planet example: if the surface color of the planet is described by a simple color texture, filtering this texture would yield a mixture of the blue water, green forests and white mountain tops. In reality, however, the blue water and green forests may not be visible when looking from shallow angles, as they are partially or fully occluded by the mountains. The filtered texture should then not contain these colors for that particular view direction. Hence, filtering the color texture using the usual methods produces wrong results.

Therefore, the main concern of this thesis is the question of how to filter complex surfaces such that highly realistic and correct real-time rendering is possible at arbitrary resolutions. Given the importance of surface rendering as one of the most fundamental tasks and its huge impact on perceived image quality, a general method that solves this problem will be highly useful in various areas of computer graphics.

The proposed solution consists of constructing a bidirectional texture function (BTF), which consists of images of the surface for many different view and light directions, from the surface description. Using such a BTF, each of these images can be filtered separately using standard methods. While the constructed BTF solves the filtering problem this way, it introduces further difficulties: BTFs are generally too large for direct rendering on the GPU, and because of their small spatial resolution they are inappropriate when rendering the surface at large resolutions, i.e. when the surface is very close.

1.1 Structure of this thesis

In Chapter 2, we introduce the basic concepts and methods needed for the rendering of realistic surfaces: reflection models, methods to add geometric detail such as displacement mapping and, of course, standard approaches to filtering. We also discuss related work, point out its issues, and show how solving them ultimately leads to a BTF-based filtering approach. The chapter concludes with an introduction to BTFs and how filtering them can be achieved.

The next two chapters explain our approach to surface rendering. In particular, Chapter 3 deals with the required preprocessing steps, such as the construction of the BTF from the surface description and its filtering and compression, while Chapter 4 explains how the data from the preprocessing stage is finally used in the real-time renderer. Following this, in Chapter 5 we analyze a prototype implementation and evaluate it in terms of quality, run time and memory requirements. Additionally, the influence of many parameters on quality and run time is investigated. Finally, Chapter 6 concludes the thesis and gives an overview of future work.

2 Preliminaries

Since surface rendering is such a fundamental task in computer graphics, a great deal of research has been done in this field, with the goal of rendering surfaces as realistically and as fast as possible. In this chapter, we will present the foundations and some of the research related to the problem of filtering reflection properties of rough surfaces. The chapter is divided into four sections. The first three sections describe the core parts of the problem: the surface is rough, may have varying reflection properties, and must be filtered for lower resolutions. The last section introduces bidirectional texture functions (BTFs), which form the foundation of the approach introduced in this thesis.

2.1 Rough Surfaces

Many objects in the real world have a rough surface, consisting of fine details that, while not affecting the object's general shape (or macrostructure), are still visible to the human eye. A typical example of such geometric detail, also called the mesostructure, is a brick wall where the bricks typically protrude from the mortar that holds them together. Together with the microstructure of the mortar and the bricks, which describes how light interacts with these materials, the mesostructure has a huge effect on the overall appearance. In the brick wall example, much of the mortar may be occluded by the bricks when looking at the wall from grazing angles, changing the appearance quite a bit.

The mesostructure is typically not explicitly modeled by polygonal meshes like the macrostructure, since this would require a large number of additional polygons and severely reduce the performance of real-time rendering algorithms. Instead, the mesostructure is commonly given as a height map, a texture describing the offset of the surface to its coarse macrostructure model. Combining the macrostructure with a height map has two implications: first, each surface point is displaced by the corresponding value in the height map, and second, the normal at the surface point changes according to the derivative of the height values.

An early approach called bump mapping, which only changes the normals without actually displacing the surface points, was introduced by Blinn in [Bli78]. Given the height map h and texture coordinates u and v, the mesostructure normal n′ can be computed from the tangent space (n,t,b) with normal n, tangent t and bi-tangent b as

n′ = n + (dh/du) t + (dh/dv) b.  (2.1)

However, this algorithm does not allow for self-occlusion and self-shadowing of the surface by the mesostructure. Approaches which also displace the surface points are called displacement mapping. These approaches can be divided into per-vertex displacement mapping and per-pixel displacement mapping. A general overview of these methods can be found in [SKU08].
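As a small illustration of Equation 2.1, the following sketch (Python with NumPy; the array and function names are ours and only serve as an example) perturbs the normal of one texel, using finite differences of the height map as approximations of dh/du and dh/dv.

import numpy as np

def bump_normal(height, u, v, n, t, b):
    """Perturb the normal at integer texel (u, v) according to Eq. (2.1)."""
    h, w = height.shape
    # central differences approximate the height derivatives dh/du and dh/dv
    dh_du = (height[v, (u + 1) % w] - height[v, (u - 1) % w]) * 0.5
    dh_dv = (height[(v + 1) % h, u] - height[(v - 1) % h, u]) * 0.5
    n_prime = n + dh_du * t + dh_dv * b
    return n_prime / np.linalg.norm(n_prime)

# example: a small random height field and a flat tangent frame
height = np.random.rand(64, 64).astype(np.float32)
n = np.array([0.0, 0.0, 1.0]); t = np.array([1.0, 0.0, 0.0]); b = np.array([0.0, 1.0, 0.0])
print(bump_normal(height, 10, 20, n, t, b))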

2.1.1 Per-vertex displacement mapping

Per-vertex displacement mapping, first described by Cook in [Coo84], requires a highly tessellated macrosurface, which can either be generated by a geometry shader on the GPU or on the CPU. Then, all vertices of this triangle mesh are displaced by the corresponding height in the direction of the surface normal, as seen in Figure 2.1.

(a) Base surface (b) Tessellated surface (c) Displaced surface

Figure 2.1: Per-vertex displacement mapping tessellates the base surface and displaces each vertex according to the height map.

The mesostructure normals are computed as described above and set for each vertex to achieve correct illumination. Of course, per-vertex displacement mapping introduces a lot of additional polygons, which contradicts the reason to use height maps in the first place. On the other hand, this method allows an adaptive refinement of the mesostructure, and if the tessellation is done on the GPU in a geometry shader, the overhead can be much lower.
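A minimal sketch of the displacement step (Python/NumPy; names are ours, and tessellation as well as normal recomputation are omitted) moves every vertex of an already tessellated mesh along its normal by the sampled height value.

import numpy as np

def displace_vertices(vertices, normals, uvs, height_map, scale=0.1):
    """Move each vertex along its normal by the sampled height (nearest texel)."""
    h, w = height_map.shape
    out = vertices.copy()
    for i, (u, v) in enumerate(uvs):
        tx = min(int(u * (w - 1)), w - 1)   # nearest-neighbour lookup
        ty = min(int(v * (h - 1)), h - 1)
        out[i] += scale * height_map[ty, tx] * normals[i]
    return out

# tiny example mesh with three vertices
verts = np.zeros((3, 3)); norms = np.tile([0.0, 0.0, 1.0], (3, 1))
uvs = [(0.0, 0.0), (0.5, 0.5), (1.0, 1.0)]
print(displace_vertices(verts, norms, uvs, np.random.rand(8, 8)))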

2.1.2 Per-pixel displacement mapping

Per-pixel displacement mapping does not actually change the geometry like per-vertex displacement mapping does, but searches for the displaced point of the mesostructure surface for each rendered pixel individually. This is done in the fragment shader by tracing the view ray into the surface volume and finding the exact intersection point with the height map. There are several iterative methods for tracing rays into the surface, which can be divided into unsafe methods and safe methods, and combinations of those.


Figure 2.2: Per-pixel displacement mapping traces the view ray for each pixel to find the exact intersection point with the height map.

Figure 2.2 shows an example, where a ray entering the surface volume at point (u,v) with height 1 exits the volume at a point (s,t) at height 0; the intersection point (u′,v′) of the ray with the height map will then always be somewhere on the line between (u,v) and (s,t).

Unsafe methods Unsafe methods exploit the fact that the intersection of a ray with the surface at height 1 is always above the detailed mesostructure surface, while the intersection at height 0 is always below it. Starting from these two points, a binary search [POC05], or more sophisticated procedures like the secant method [RSP05], can be used to find the exact intersection point with the height field. While being very fast, these unsafe methods may skip surface details, and will therefore not always return the first intersection point of the ray with the surface, which is the one actually seen, but an arbitrary one. This behavior can lead to unpleasant sampling errors.

Safe methods To always find the first intersection point, a linear search [MM] or a method using distance functions is needed. For a linear search between the start point at height 1 and the end point at height 0, the algorithm moves along the ray with uniform steps and tests whether the current point is already below the surface. The linear search may skip small features of the height map as well if the step size is chosen too large; however, for sufficiently small steps a linear search will always find the first intersection. Distance functions store additional information from which the next step size along the ray can be computed, such that it never pierces through the surface and skips surface details. These distance functions also allow skipping space much faster when the currently sampled point is still far from the surface, making per-pixel displacement mapping with distance functions usually faster and more robust than a linear search.
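As an illustration, the following sketch (Python/NumPy; all names are ours) performs such a linear search on a height map; in a fragment shader the same loop would run per pixel on the GPU.

import numpy as np

def linear_search(entry_uv, exit_uv, height_map, steps=64):
    """March from the entry point (height 1) to the exit point (height 0)
    and return the texture coordinates of the first sample below the surface."""
    h, w = height_map.shape
    for i in range(1, steps + 1):
        s = i / steps                           # ray parameter in [0, 1]
        u, v = (1 - s) * np.asarray(entry_uv) + s * np.asarray(exit_uv)
        ray_height = 1.0 - s                    # height decreases linearly from 1 to 0
        tx = min(int(u * (w - 1)), w - 1)
        ty = min(int(v * (h - 1)), h - 1)
        if ray_height <= height_map[ty, tx]:    # below the surface: hit found
            return (u, v)
    return exit_uv                              # no hit within the step budget

hmap = np.random.rand(32, 32)
print(linear_search((0.2, 0.2), (0.8, 0.8), hmap))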


Figure 2.3: Per-pixel displacement mapping using the cone stepping distance function

Examples of such displacement mapping methods using distance functions are sphere tracing [Don05], pyramidal displacement mapping [OKL06] and cone stepping [Dum06]. In cone stepping, a cone that touches the surface but never intersects it is stored for each surface point. Using these cones, the ray can be traced by advancing, at each step, to the intersection of the ray with the cone of the current surface point; the sampled points on the ray then converge to the intersection point, as in Figure 2.3.

Combinations of unsafe and safe methods It is possible to combine both safe and unsafe methods and merge their respective advantages, i.e. having the fast convergence rate of unsafe methods while ensuring that the first intersection is always found. This is done by iteratively doing safe steps until the first point below the detailed surface is found, then refining this point using an unsafe method. Examples of such combinations of different methods are iterative parallax occlusion mapping [BT04], which combines linear and binary search, and relaxed cone step relief mapping [PO08], which combines binary search with a relaxed variant of cone stepping.

Some of the methods using distance functions pose additional restrictions on per-pixel displacement mapping. For example, cone stepping and relaxed cone step relief mapping only work correctly for rays pointing towards the surface. For rays pointing outwards, for example if the camera is inside the surface volume, these methods may skip surface details. For a more thorough discussion of both per-vertex and per-pixel displacement mapping with many different variants, see [SKU08].

2.1.3 Curved surfaces

The per-pixel displacement mapping methods discussed before can not be used directly to render height fields on curved surfaces, since they all assume the base mesh to be a flat square—and therefore the surface volume to be a cuboid. Therefore, these methods would not be able to find the correct silhouette of the detailed object. To address this, several methods have been proposed, which can be divided into two classes: given a polygonal base mesh, the first class extrudes each polygon into a prism and renders all these prisms [HEGD04]. For each prism, all intersection points of the ray and the height map inside the prism are drawn, while all intersections outside of the current prism are discarded—they will be found in another prism (which possibly has a different tangent space). This approach works well; however, it introduces a lot of new triangles and therefore significantly decreases performance.

(a) Rays in object space (b) Rays in tangent space

Figure 2.4: Per-pixel displacement mapping for curved surfaces. Instead of tracing the straight view ray in object space, the ray itself is curved in tangent space.

The second class of algorithms uses depth maps and, instead of curving the height map according to the curvature of the base surface, bends the ray accordingly (see Figure 2.4). These approaches usually trace the ray using linear steps, as the usual distance functions do not work for curved surfaces. To bend the ray, the methods need to know the curvature of each point on the surface. For this, curved relief mapping [OP05] approximates the local geometry at each vertex with a quadric and uses this to curve the ray. However, using only local information may miss some important intersections in difficult cases, as noted in [CC08]. There, the authors introduce another approach based on normal and tangent maps, which can be used to look up the tangent space for each point on the base surface. Using these maps to update the tangent space of the ray in each step yields an easy and robust method with good rendering results.

2.2 Varying reflectance

Surfaces can have very different reflection properties, ranging from perfect mirrors and highly specular materials like calm water to perfectly diffuse materials such as clay or chalk. Also, the roughness of the microstructure has a large influence on the reflectance: a rough sea looks very different from a calm sea, even if one is too far away to see its individual waves. Additionally, a surface may be anisotropic like brushed metal, where the reflectance changes significantly when the surface is rotated around its normal. To describe these and other effects of the reflection properties of a surface, bidirectional reflectance distribution functions are used.

2.2.1 BRDFs

The bidirectional reflectance distribution function (BRDF) defines the ratio of radiance emitted in direction ωr = (θr,φr) for incident irradiance coming from a light source in direction ωi = (θi,φi). For these directions, θ denotes the inclination, i.e. the angle to the surface normal, and φ denotes the polar angle, i.e. the rotation around the normal.


Figure 2.5: A BRDF, showing the reflected radiance of a surface with normal n for light coming from direction ωi

Figure 2.5 shows an example BRDF for a fixed incident direction ωi. This BRDF is mostly diffuse, reflecting the light in all directions except for a small glossy highlight in mirror direction. Of course, the BRDF is defined for all directions, not only those in the plane of incidence as shown in the figure. Furthermore, the BRDF is defined for all incident directions ωi.

As such, the general definition of a BRDF is given as a four-dimensional function fr(θr,φr;θi,φi). However, often the BRDF is rotationally invariant, i.e. the surface can be rotated around its normal vector without changing the reflected radiance. These isotropic BRDFs only depend on the difference between angle φr and φi, and can therefore be defined by a three-dimensional function fr(θr,θi,φr − φi).

To compute the color Lr of a surface point (its radiance) from the BRDF, the BRDF is evaluated for the view and light directions in tangent space and multiplied by the light intensity E0 and the cosine of the inclination of the incident light, cosθi.
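Written out, for a single light source of intensity E0 this is

Lr = fr(ωr, ωi) E0 cosθi.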

2.2.2 Models

Since general BRDFs for quantized view and light directions need a lot of storage space and are hard to define if no measurements are available, many models have been introduced to computer graphics that only depend on a few parameters. Some of the most popular models include:
• the Lambert model for perfect diffuse materials, where all incoming light is scattered uniformly in all directions,
• the Phong model, a very common model for glossy surfaces, being fast to compute,
• the Blinn-Phong model, a variant of the Phong model which behaves slightly more physically plausible,
• the model by Oren and Nayar for rough diffuse surfaces which exhibit retro-reflection,
• the Cook-Torrance model, the first micro-facet based physical model for specular materials, and
• the Ward model for anisotropic reflectance.
Of course, there are many more models used in computer graphics. A survey on many of these models can be found in [KE09]. In the remainder of this section we will briefly introduce a few of these models.

(a) Lambert reflection model (b) Oren-Nayar reflection model for roughnesses σ = 0.0, 0.3, 0.6, 1.0

Figure 2.6: Comparison of diffuse reflection models for different angles of incident light

Most of these models use the following variables: n, the normal vector of the surface; l, the direction vector to the light source; v, the view direction; and h = (v + l)/|v + l|, the half vector between l and v.

Lambert One of the easiest reflection models is the Lambert model for ideal diffuse reflection. Lambertian surfaces reflect all incident light uniformly in all directions, as seen in Figure 2.6a, hence the BRDF for this model is just a constant:

fr = ρ/π, and  (2.2)
Lr = (ρ/π) (n · l) E0.  (2.3)

The parameter ρ is called albedo and specifies the total amount of light that is reflected. This model is very cheap to evaluate, only depending on the angle between light direction and surface normal θi, and is often sufficient for describing diffuse surfaces.

Oren-Nayar However, very rough diffuse materials often exhibit additional properties like back-reflection that can not be described by the Lambert model. For example, a full moon looks more like a disc than like a Lambert-shaded sphere, as in Figure 2.6b for high roughness values. To describe these properties, Oren and Nayar developed a physical model of diffuse surfaces based on micro-facets [ON94], which depends on an additional roughness parameter σ. According to this model, each micro-facet of the surface is treated as an ideal diffuse (Lambertian) reflector. Since their model did not have a closed-form solution, they also derived a simplified qualitative model, which is most often used in computer graphics:

Lr(θr,θi,φr − φi;σ) = (ρ/π) E0 cosθi (A + B max(0, cos(φr − φi)) sinθ+ tanθ−)  (2.4)
A = 1 − 0.5 σ^2/(σ^2 + 0.33)  (2.5)
B = 0.45 σ^2/(σ^2 + 0.09),  (2.6)

where θ+ = max(θr,θi) and θ− = min(θr,θi).

(a) Blinn-Phong reflection model for shininess exponents s = 20, 50, 200 (b) Cook-Torrance reflection model for roughnesses m = 0.05, 0.2, 0.4

Figure 2.7: Comparison of specular reflection models on a diffuse sphere for different angles of incident light

Blinn-Phong Blinn-Phong is a simple model for describing glossy surfaces [Bli77]. While the complete Blinn-Phong model describes surface reflection as a sum of a diffuse, specular and ambient term, we will only use the specular reflection here (see Figure 2.7a). It can be computed as the cosine between surface normal n and half vector h, raised to a power s called shininess—yielding a cosine lobe:

fr(n,l,v;s) = ks (n · h)^s / (n · l), and  (2.7)
Lr = ks E0 (n · h)^s.  (2.8)

Cook-Torrance The reflection model by Cook and Torrance is another physically based micro-facet model [CT82]. Each micro-facet is treated as a perfect mirror. The Cook-Torrance model shows some effects of realistic surfaces that Blinn-Phong ignores, like off-specular reflection and a color shift of the specular reflection depending on the directions ωr and ωi, as seen in Figure 2.7b. The Cook-Torrance model is a product of three terms: the normal distribution function D, the Fresnel term F and the geometric attenuation function G. For real-time computer graphics, Schlick's approximation for the Fresnel term [Sch94] is usually used.

A common normal distribution function is the Beckmann distribution. With this, the Cook-Torrance model can be described as follows:

Lr = D F G / (π (l · n)),  (2.9)
G = min(1, 2(n · h)(n · v)/(v · h), 2(n · h)(n · l)/(v · h)),  (2.10)
D = 1/(m^2 cos^4 α) · e^(−(tanα/m)^2), and  (2.11)
F = F0 + (1 − F0)(1 − v · h)^5,  (2.12)

where α is the angle between surface normal n and half-vector h, F0 is the reflectance at normal incidence and m is a roughness parameter.
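For reference, the following sketch (Python/NumPy; function and parameter names are ours) evaluates Equations 2.9 to 2.12 for unit vectors n, l and v, using the Beckmann distribution and Schlick's Fresnel approximation.

import numpy as np

def cook_torrance(n, l, v, F0, m):
    """Specular radiance according to Eqs. (2.9)-(2.12); all vectors unit length."""
    h = l + v
    h = h / np.linalg.norm(h)
    nl, nv, nh, vh = n @ l, n @ v, n @ h, v @ h
    if nl <= 0.0 or nv <= 0.0 or nh <= 0.0 or vh <= 0.0:
        return 0.0
    tan_a2 = max(1.0 - nh * nh, 0.0) / (nh * nh)            # tan^2(alpha)
    D = np.exp(-tan_a2 / (m * m)) / (m * m * nh ** 4)       # Beckmann distribution
    F = F0 + (1.0 - F0) * (1.0 - vh) ** 5                   # Schlick Fresnel
    G = min(1.0, 2.0 * nh * nv / vh, 2.0 * nh * nl / vh)    # geometric attenuation
    return D * F * G / (np.pi * nl)

# example: light and viewer 30 degrees off the normal, on opposite sides
n = np.array([0.0, 0.0, 1.0])
l = np.array([np.sin(0.5), 0.0, np.cos(0.5)])
v = np.array([-np.sin(0.5), 0.0, np.cos(0.5)])
print(cook_torrance(n, l, v, F0=0.04, m=0.2))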

2.3 Filtering

Combining a displacement mapping algorithm like relief mapping and different reflection models, rough surfaces with varying reflection can already be rendered. However, both the geometry and the varying reflection are susceptible to aliasing when rendered naïvely at small resolutions—giving rise to distracting artifacts. This aliasing follows from the sampling theorem, which states that the sampling frequency must be higher than twice the highest signal frequency. For small resolutions, however, the pixel grid on which the surface is rendered will at some point have a frequency smaller than twice the highest frequencies of the surface details. This results in distracting Moiré patterns or other unwanted distortions and artifacts. To ensure a correct reconstruction, the high frequency details of the surface must be removed before rendering. This is usually done by filtering the surface textures. For color textures, the standard approach for this type of filtering is called mipmapping and is described in the next section. However, varying surface reflection parameters like specularity can not be correctly handled with mipmapping—even less so if the surface reflectance can vary arbitrarily. Fortunately, aliasing caused by the surface geometry is mostly not as obvious and can be ignored for per-pixel techniques like relief mapping, as suggested in [PO08]. However, geometry and variable surface reflectance interact in a way that makes correct filtering even harder.

2.3.1 Mipmapping

Mipmaps (from MIP = multum in parvo, much in little) were introduced by Williams in [Wil83] and have since been a standard tool for filtering in real-time computer graphics. The basic approach is to construct a sequence of images, where each image is a filtered version (with 1/4 the texels) of the previous image. This sequence, also known as a Gaussian pyramid (although other filters may be used, for example a simple box filter), is then used for rendering, where for each pixel the pyramid layer is selected for which a texel spans the whole pixel. This way, a prefiltered texture without high frequency details is used for drawing and thus aliasing is reduced. The selection of a pyramid level is done by computing a level of detail (LoD) parameter from the distance of the object from the camera and the angle α between surface normal and viewing direction, as the visible area of the surface is inversely proportional to cos(α). The LoD parameter can also be used to interpolate between adjacent mipmap levels to remove sudden changes in sharpness that would otherwise occur at LoD changes.
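A simplified sketch of the level selection (Python; the names and the simplified footprint formula are ours, and real implementations usually derive the footprint from screen-space derivatives of the texture coordinates instead) illustrates how distance and the angle α enter the LoD parameter.

import math

def select_lod(distance, focal_length, texels_per_unit, cos_alpha, num_levels):
    """Pick a (fractional) mipmap level from viewing distance and orientation.

    The footprint of one pixel on the surface grows with distance and with
    1/cos(alpha), where alpha is the angle between normal and view direction."""
    footprint = distance / (focal_length * cos_alpha)      # world units per pixel
    texels_per_pixel = footprint * texels_per_unit
    lod = max(0.0, math.log2(max(texels_per_pixel, 1e-6)))
    return min(lod, num_levels - 1)                        # fractional value: blend levels

# example: a surface 10 units away, seen at 60 degrees
print(select_lod(distance=10.0, focal_length=2.0, texels_per_unit=128,
                 cos_alpha=0.5, num_levels=8))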

However, these mipmaps can not be used for arbitrary surfaces as explained earlier. For these, more sophisticated methods were developed.

2.3.2 Transition between bump rendering algorithms

Becker and Max did early work on how to overcome the problem of filtering parametrized surfaces [BM93]. Their approach is based on the hierarchy of scales mentioned by Kajiya in [Kaj85], which is useful for modeling the complexity of natural surfaces. In this work, the hierarchy is interpreted as the geometry (displacement mapping), mesostructure (bump mapping) and lighting model (BRDF). Basically, the surface is rendered with these three algorithms based on its distance to the viewer (and hence, its resolution): for close surface points displacement mapping is used, surface points a bit further away use bump mapping, and surfaces very far away are rendered simply with the BRDF.

However, standard bump mapping is not consistent with displacement mapping, since normal vectors may be occluded and hence the intensity of the surface drawn with bump mapping can differ from the one drawn with displacement mapping. The authors solve this problem by redistributing the perturbed normals from bump mapping to achieve the same statistical distribution as with displacement mapping. Of course, this requires the measurement of visible normals for each viewing direction, from which these redistribution functions (one for each direction) are created. Similarly, the BRDF is computed from the normal distribution as well to be consistent with displacement and bump mapping. With these three algorithms being consistent, the renderer can choose an appropriate algorithm for each pixel and interpolate between these algorithms when necessary. This yields a surface that looks as if it was fully displacement mapped even though large parts of it are in fact rendered with much cheaper algorithms. Besides its effect on runtime efficiency, the use of BRDFs for small resolutions also reduces aliasing.

However, this approach also has a range of restrictions. Firstly, there are only three steps in the rendering pipeline: the surface is drawn using displacement mapping, bump mapping or a constant BRDF. Furthermore, bump mapping and displacement mapping are in a sense similar, as both can be used for rendering at the same resolutions, with the only real difference being occlusion and silhouettes; i.e. the reason why bump mapping is used instead of displacement mapping is only that it is faster, not that it is better for smaller resolutions (which it is not). This means that the bump map must contain only a single frequency or a narrow frequency range, otherwise the transition point from bump mapping to the BRDF is not clear—either it occurs too early and low frequencies in the bump map are blurred, or it occurs too late and high frequencies cause aliasing artifacts. The restriction to a single frequency can be lifted by layering multiple maps with different frequencies and choosing the rendering algorithm for each of them separately, but it is still a major restriction. A solution to this problem would be not to transition directly from bump mapping to a single BRDF, but to have a pyramid of filtered normal distributions, i.e. a mipmap built not from a color texture but from the measured normal distributions (so we cannot simply use Gaussian filters). Then the last layer would be equal to the normal distribution of the whole surface, from which also the BRDF is constructed, and all other layers would contain the normal distribution for smaller and smaller parts of the texture.
Another restriction of the given approach is occlusion, which only changes the normal distributions but completely ignores all other effects. For example, when used in conjunction with a color texture, the bumps may occlude only red texels while blue texels remain visible; this approach, however, always renders the surface point using the same color. Also, the approach does not handle shadowing. Similar to occlusion, shadowing may have different effects, such as changing the distribution of surface normals that are illuminated (and hence can be used for light computation) and changing visible color texels.

Finally, the base BRDF of the surface must be constant—the approach cannot be used for surfaces with varying reflection, for example a glossy surface with diffuse bumps. To solve these problems, we note two things:
1. To correctly handle occlusion and varying surface reflectance, we need to consider the whole BRDF instead of simply the normal distribution.
2. In the given approach, the normal distribution is calculated for each view direction, because it may cause normals to be masked. Similarly, the light direction may cause normals to be shadowed (and therefore also not take part in calculating the reflected light). Therefore, to handle shadowing, we have to compute the normal distributions (or BRDFs) for each view and light direction.

2.3.3 BRDF mixture models

In [TLQ+05], Tan et al. start from the observation that the reflectance of a pixel (resp. the BRDF) at lower resolution can be considered as a combination of the reflectance of several pixels (resp. BRDFs) at higher resolution. To represent these BRDF combinations, they use a mixture model that combines several weighted single BRDFs:

B(l,v) = Σ_{i=1..N} αi Bi(l,v)  (2.13)

Using these mixture models, a reflectance mipmap is constructed, where each level contains parameters for N-component mixture models. This is done by aligning the components of neighboring texels and then combining them into a new N-component mixture model for the next mipmap level. Since the single BRDFs for neighboring texels are aligned, the mipmap can be filtered using standard trilinear hardware filtering. The paper contains the algorithms for this using mixture models consisting of multiple Gaussian or cosine lobes.

However, even though the method can combine models aligned to different normal vectors, it is only correct for flat surfaces. The reason is that BRDFs are always combined, even if some surface points are not visible due to self-occlusion of the surface—their BRDF, however, still appears as a component in BRDF mixtures at higher mipmap levels. In [TLQ+08], the authors describe how their method can be extended to handle self-shadowing and masking of the surface. This is done by computing a horizon map and filtering this horizon map as a mipmap by aggregating the horizon angles into horizon map distributions, again fitted using a Gaussian lobe model. From these horizon map distributions the ratio of illuminated facets can be computed and used for attenuating the BRDF.

However, again this is not completely correct, since this method for self-shadowing does not take into account which BRDFs are shadowed; the method only computes a factor that should handle self-shadowing. An example where this fails is again the mountain range: for grazing angles of the sun near sunrise or sunset, the lush, green valleys will still be in deep shadow, while the snow-topped peaks will be illuminated. Here, scaling the sum of the green valley-BRDFs and white mountain-BRDFs by a single factor is not enough to capture the correct appearance of the surface. Therefore, a solution must distinguish between the different view and light directions when combining the BRDFs—i.e. we would need not only a single BRDF pyramid, but one for each view and light direction.
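A small sketch of Equation 2.13 (Python/NumPy; the cosine-lobe component BRDF and all names are our own choice for illustration) evaluates such a mixture as a weighted sum of single lobes.

import numpy as np

def cosine_lobe(n, l, v, shininess):
    """A single Blinn-Phong style cosine lobe, used as component BRDF B_i."""
    h = (l + v) / np.linalg.norm(l + v)
    return max(n @ h, 0.0) ** shininess

def mixture_brdf(n, l, v, weights, shininesses):
    """Evaluate B(l,v) = sum_i alpha_i * B_i(l,v), as in Eq. (2.13)."""
    return sum(a * cosine_lobe(n, l, v, s) for a, s in zip(weights, shininesses))

# two-component mixture: one broad and one sharp lobe
n = np.array([0.0, 0.0, 1.0])
l = np.array([0.0, 0.6, 0.8]); v = np.array([0.0, -0.6, 0.8])
print(mixture_brdf(n, l, v, weights=[0.7, 0.3], shininesses=[8.0, 64.0]))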

2.4 Bidirectional texture functions

In the previous section we saw that a solution for the filtering problem of rough surfaces with varying reflection properties must compute the BRDF for each surface point and for each view and light direction, and filter these BRDFs again for every view and light direction, only adding those that are not occluded or shadowed. Of course, BRDFs are also defined in terms of view and light direction; this means that, for a given view and light direction, we would first have to find the corresponding BRDF and then evaluate the BRDF for exactly this view and light direction. Clearly, such a strategy is not useful; instead it is enough to only store the value of each BRDF for exactly that view and light direction—and that is exactly what a bidirectional texture function (BTF) does. A BTF, introduced by Dana et al. in [DNGK97], is a seven-dimensional function which maps, for each surface point p = (x,y) and each wavelength λ, the reflected radiance in view direction ωr = (θr,φr) when the surface is illuminated from a light source in direction ωi = (θi,φi):

BTF(p,λ,ωr,ωi) (2.14)

Typically, BTFs are measured from real-world surfaces by taking RGB pictures for a discrete number of light and camera positions spanning the whole hemisphere, which are then rectified and resampled to a given resolution. So for a resolution n × n, p camera positions and q light positions, the BTF consists of n · n · p · q · 3 radiance values. BTFs are currently one of the most accurate representations of real-world material appearance and can be used for highly realistic rendering.

(a) Texture representation (b) ABRDF representation

Figure 2.8: Two different representations of a BTF

Instead of treating the BTF as a set of p · q images for each view and light direction, it can also be seen as a set of n · n pixel-wise apparent BRDFs (ABRDFs). These ABRDFs are called apparent since for rough surfaces the exact surface point of a given pixel (x,y) may differ slightly for different view directions; hence the ABRDF may violate the two BRDF properties of reciprocity and energy conservation. Figure 2.8 shows both representations for a synthetic BTF. A measured BTF automatically contains effects produced by the surface geometry and reflectance, such as self-occlusion, self-shadowing, interreflection and subsurface scattering. However, when approximating a surface by a BTF, the contour of curved objects will be that of the approximated base surface, not the original one. The biggest problem of using BTFs, especially in real-time systems, is their large size. For a resolution of n × n = 64 × 64 pixels, p = 81 camera positions and q = 81 light positions and 24-bit RGB measurements, as used in [MCT+05], this already amounts to 76.9 MiB. This shows the need for powerful compression methods for BTF data.
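For reference, this number follows directly from the dimensions: 64 · 64 · 81 · 81 · 3 bytes = 80,621,568 bytes ≈ 76.9 MiB.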

2.4.1 LOD for BTFs

Filtering for BTFs was introduced by Ma et al. in [MCT+05]. The idea is quite simple: if the BTF is treated as a set of textures as in Figure 2.8a, each such texture can be filtered individually by constructing a mipmap. Then the BTF renderer can simply use the prefiltered versions. Similarly to [TLQ+05], this works since the resampling of the BTF textures actually combines several ABRDFs into a single ABRDF; these apparent BRDFs automatically take into account self-occlusion, self-shadowing and so on, i.e. all directional effects. This means that whenever a given surface point is occluded for some view direction, the textures for this view direction do not contain the reflectance values for that surface point; hence the filtering does not introduce wrong results. In contrast to [TLQ+05], the ABRDFs are, of course, not approximated by mixture models but stored explicitly. For another intuitive way to see why this filtering is correct, recall that the surface rendered at a small resolution should look like a resampled version of the rendering at large resolutions. Now, since the textures of the BTF capture exactly the appearance of the surface for a given view and light direction, resampling during mipmap creation achieves exactly this.
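A minimal sketch of this per-direction filtering (Python/NumPy; names are ours, and a simple box filter stands in for the Gaussian filter) builds one pyramid per view and light direction pair, so all directional effects stay baked into every level.

import numpy as np

def build_mipmap(image, levels):
    """Box-filtered pyramid of a single BTF image (H x W x 3, sizes powers of two)."""
    pyramid = [image]
    for _ in range(levels - 1):
        img = pyramid[-1]
        img = 0.25 * (img[0::2, 0::2] + img[1::2, 0::2] +
                      img[0::2, 1::2] + img[1::2, 1::2])
        pyramid.append(img)
    return pyramid

def filter_btf(btf, levels):
    """btf maps (view_idx, light_idx) to an image; one pyramid per direction pair."""
    return {key: build_mipmap(img, levels) for key, img in btf.items()}

# example with a single 64x64 image for one direction pair
btf = {(0, 0): np.random.rand(64, 64, 3)}
print([lvl.shape for lvl in filter_btf(btf, 4)[(0, 0)]])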

2.4.2 Compression and rendering

Rendering the uncompressed (and possibly prefiltered) BTF is quite easy. By projecting the view and light directions into the tangent space of the surface point and comparing the vectors to all sampled view and light directions, the index of the closest direction can be found. Then, rendering simply consists of looking up the BTF value at the given texture coordinate and view and light direction index. Of course, the quality can be increased by interpolating between the three nearest view and light directions instead of taking only the closest. However, since the original BTF dataset is usually too large for direct rendering, various compression methods have been developed. A good overview and comparison of BTF compression methods can be found in [FH09]. Not all compression methods are applicable for real-time rendering.

Compression methods usable for real-time rendering can be divided into two categories. The first category regards the BTF as a set of ABRDFs for each texel and compresses these using various techniques. Methods in the second category, which consistently achieve better reconstruction quality, are based on a linear factorization of the BTF into a set of eigen-ABRDFs and eigen-textures, which can be decompressed by computing the sum of the products of eigen-ABRDFs and eigen-textures for the given surface point and view and light direction. This factorization, using principal component analysis or, equivalently, singular value decomposition, can be computed for the whole BTF at once, or separately for parts of it, for example for each view direction [SSK03] or by clustering the BTF space [MMK03]. Factorization of the whole BTF at once, as proposed by Koudelka et al. in [KMBK03], results in better compression; however, it is usually more expensive to reconstruct the BTF values, since more components (i.e. eigen-ABRDFs and eigen-textures) need to be combined.

The compression method used in [MCT+05] is similar to the factorization of the whole BTF at once, but the BTF images are segmented into their frequency bands using Laplacian pyramids, and each level is factorized separately. This compression elegantly incorporates prefiltering of the BTF images and achieves good quality at quite small data sizes. Similarly to factorization of the whole BTF, it is not the fastest method for reconstruction, since many components are needed for accurate results, but real-time rendering is still possible. In total, it seemed like an efficient compression method for our case. The next chapter will explain how this method works exactly.
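As a rough illustration of the factorization idea (Python/NumPy; a plain truncated SVD of the full data matrix, not the exact Laplacian-pyramid pipeline of [MCT+05] described in the next chapter), the BTF is arranged as a matrix with one row per direction pair and one column per texel, and a value is reconstructed as a sum over the retained components.

import numpy as np

# synthetic BTF: 81*81 direction pairs, 32x32 texels, luminance only
num_dirs, num_texels = 81 * 81, 32 * 32
btf_matrix = np.random.rand(num_dirs, num_texels).astype(np.float32)

mean = btf_matrix.mean(axis=0)
u, s, vt = np.linalg.svd(btf_matrix - mean, full_matrices=False)

c = 16                                   # number of retained components
weights = u[:, :c] * s[:c]               # one weight vector per direction pair
eigen_textures = vt[:c]                  # one "eigen-texture" per component

# reconstruction of one value for a given direction pair d and texel t
d, t = 123, 456
value = mean[t] + weights[d] @ eigen_textures[:, t]
print(value, btf_matrix[d, t])           # reconstruction from c components vs. original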

3 BTF generation

A rough surface with varying reflection properties can be filtered using BTFs. In this chapter, we explain the preprocessing, i.e. the generation, filtering and compression of the BTF data, which will then be used for real-time rendering of the surface. An overview of all preprocessing steps is shown in Figure 3.1. The process starts with a few input textures: a depth texture and some textures describing the reflection properties of the surface, such as the color texture in Figure 3.1. Using the depth texture, a relief map is computed (1) that contains derivatives of the depth map and a distance function for each pixel that can be used to accelerate relief mapping. Using this relief map and the reflection textures, the relief mapping algorithm renders the surface for different view and light directions (2), yielding a set of rendered images. Each of these images is rectified (3) and converted to YCbCr (4); these two steps can improve the compression. For each of the images we generate a Laplacian pyramid (5), which splits the image into its different frequency bands. This set of Laplacian pyramids is then compressed using principal component analysis (6), which yields a set of principal vectors and weight maps for each set of images of the same level in the Laplacian pyramids, and additionally a mean vector for the last level. In the last step (7), the principal vectors and the weights are packed into two textures. Together with the mean vector, these textures can then be used for the real-time rendering of the filtered surface.


Figure 3.1: Preprocessing of the surface data consists of the following steps: 1. computation of the relief map, 2. rendering of the surface using relief mapping, 3. rectification, 4. conversion to YCbCr, 5. computing Laplacian pyramids, 6. principal component analysis, and 7. texture packing.

3.1 Surface rendering

The first two steps consist of rendering the surface for all view and light directions. Similar to [SSK03], we choose 81 different view and light directions ω = (θ,φ) according to Table 3.2a. The directions are shown in Figure 3.2b. Unlike the original paper, we use the same directions both for viewing and lighting.

θ [°]   Δφ [°]   No. of images
 0       –          1
17       60         6
34       30        12
51       20        18
68       18        20
85       15        24

(a) View and light direction angles (b) Half-sphere of all view and light directions

Figure 3.2: View and light directions used in the BTF representation
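The directions of Table 3.2a can be generated programmatically; the following sketch (Python; function names are ours) enumerates the rings of constant inclination with the listed Δφ spacing and converts each direction to a tangent-space vector.

import math

RINGS = [(0, 1), (17, 6), (34, 12), (51, 18), (68, 20), (85, 24)]   # (theta in deg, count)

def sample_directions():
    """Return the 81 (theta, phi) pairs of Table 3.2a, in degrees."""
    dirs = []
    for theta, count in RINGS:
        for k in range(count):
            dirs.append((theta, k * 360.0 / count))
    return dirs

def to_vector(theta, phi):
    """Convert inclination/polar angle (degrees) to a unit vector in tangent space."""
    t, p = math.radians(theta), math.radians(phi)
    return (math.sin(t) * math.cos(p), math.sin(t) * math.sin(p), math.cos(t))

dirs = sample_directions()
print(len(dirs), dirs[:3], to_vector(*dirs[1]))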

To actually render the surface structure using these directions, we use one of the per-pixel displacement mapping algorithms discussed in the previous chapter. The rendering speed of this algorithm during the preprocessing stage is not important; however, since we want to use the algorithm also for real-time rendering, we decided on relaxed cone stepping for relief mapping, which combines high quality results with good performance. For each point on the surface that was found using relief mapping, we have to compute the reflected radiance by evaluating the lighting models. We have chosen to implement four widely used lighting models: for diffuse reflection the models by Lambert and Oren-Nayar, and for glossy surfaces the models by Blinn-Phong and Cook-Torrance. These models can be combined using different textures. Using a ray-tracer instead of our relief-mapping renderer, it would be possible to compute additional illumination effects during the preprocessing stage, such as subsurface scattering and inter-reflection. The generated BTF would then include these effects and could be rendered in real-time without any modifications to the BTF renderer. However, since we do not use a pure BTF renderer, but transition to the relief mapping renderer for larger resolutions instead, this would create inconsistent results (unless, of course, these effects could be incorporated into the relief renderer without losing too much performance). Therefore we do not use these additional illumination effects.

3.1.1 Lighting models

We have chosen two simple models, which can be evaluated quickly, Lambert and Blinn-Phong, and two realistic micro-facet based models, Oren-Nayar and Cook-Torrance. That way, there is a choice between high quality models and better performance. Of course, other lighting models can be implemented if needed: if measured BRDF data is available, one might want to fit it using a Lafortune model. If the modeled surface exhibits anisotropic behavior, this can be realized by implementing the Ward model, or by explicitly creating a depth map that yields these anisotropic effects, for example by using parallel v-shaped trenches for brushed metal. Most models can be directly implemented using the formulas from Section 2.2.2; however, the direct implementation of Oren-Nayar is quite inefficient due to the high number of trigonometric functions. Substituting cosθi = n · l, cos(φr − φi) = (v − n(v · n)) · (l − n(l · n)), and using the trigonometric identities

sinθ+ = √(1 − cos^2 θ+) and tanθ− = √(1 − cos^2 θ−) / cosθ−, the model can be written as:

Lr(n,l,v;σ) = (ρ/π) E0 (l · n) (A + B max(0, (v − n(v · n)) · (l − n(l · n))) C)  (3.1)
A = 1 − 0.5 σ^2/(σ^2 + 0.33)  (3.2)
B = 0.45 σ^2/(σ^2 + 0.09)  (3.3)
C = √((1 − (v · n)^2)(1 − (l · n)^2)) / max(v · n, l · n).  (3.4)

This form is computationally much cheaper to evaluate. In this thesis, the evaluation of the lighting model adds three terms: an ambient illumination term and both a diffuse and a specular term. One can choose which model is used for evaluating the latter two terms—for the diffuse term either Lambert or Oren-Nayar and for the specular term either Blinn-Phong or Cook-Torrance. In total, three textures are used to store all model parameters:

1. The first RGB texture Ta defines the ambient component of each surface point. This component is always added to the result without any lighting calculation.
2. The second RGBA texture Td stores parameters for the diffuse term. The RGB channels contain the diffuse color for each surface point, i.e. the albedo ρ, while the alpha channel contains the roughness σ for the Oren-Nayar model. If the Lambert model is used as the diffuse term, the alpha channel is ignored.
3. The third RGBA texture Ts stores parameters for the specular term. The RGB channels contain the specular color (ks for Blinn-Phong and F0 for Cook-Torrance), while the alpha channel stores the shininess s for Blinn-Phong or the roughness m for Cook-Torrance.
For example, if using Oren-Nayar and Cook-Torrance as lighting models, the radiance of a surface point will be calculated as

Lr = Ta.rgb + Lr,oren-nayar(ρ = Td.rgb,σ = Td.a) + Lr,cook-torrance(F0 = Ts.rgb,m = Ts.a). (3.5)
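As an illustration of Equation 3.5, the following sketch (Python/NumPy; the function names and the simple stand-in terms are ours, not the actual shader code) combines the three parameter textures for one surface point. In the real renderer the diffuse and specular callables would be the Oren-Nayar and Cook-Torrance implementations given above.

import numpy as np

def shade(Ta, Td, Ts, diffuse_term, specular_term, n, l, v, E0=1.0):
    """Combine ambient, diffuse and specular term as in Eq. (3.5).

    Ta: RGB ambient color, Td: RGBA (albedo, sigma), Ts: RGBA (F0/ks, m/s).
    diffuse_term and specular_term are callables implementing the chosen models."""
    ambient = np.asarray(Ta, dtype=np.float32)
    diffuse = diffuse_term(rho=np.asarray(Td[:3]), sigma=Td[3], n=n, l=l, v=v, E0=E0)
    specular = specular_term(F0=np.asarray(Ts[:3]), m=Ts[3], n=n, l=l, v=v, E0=E0)
    return ambient + diffuse + specular

# minimal stand-ins: Lambert diffuse (ignores sigma) and a zero specular term
lambert = lambda rho, sigma, n, l, v, E0: rho / np.pi * max(n @ l, 0.0) * E0
no_spec = lambda F0, m, n, l, v, E0: np.zeros(3)

n = np.array([0.0, 0.0, 1.0]); l = np.array([0.0, 0.6, 0.8]); v = np.array([0.0, 0.0, 1.0])
print(shade([0.02, 0.02, 0.02], [0.5, 0.3, 0.2, 0.4], [0.04, 0.04, 0.04, 0.2],
            lambert, no_spec, n, l, v))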

3.1.2 Per-pixel displacement mapping

To adequately render the surface structure, per-pixel displacement mapping is used, namely relaxed cone stepping for relief mapping [PO08]. As described in the previous chapter, per-pixel displacement mapping works by ray-casting the view ray for each rendered pixel into the surface and finding the intersection point of that ray with the height map.

Relaxed Cone Step Mapping The relaxed cone stepping method combines a safe search using a distance function to find the first point below the height map with a second phase, in which a binary search finds the exact intersection point between the point below the height map and the last point above it. The distance function that is used is a variation of the cone stepping described in the previous chapter. The ray-casting procedure starts at the top of the surface volume and at each step follows the ray to its intersection with the cone at the current position (see Figure 3.3). The cones are defined in such a way that following any ray to its cone intersection will never skip a surface peak – but the ray may pierce the surface such that the new position is below the surface height. In this case, the binary search can be started to find the exact intersection point.


Figure 3.3: Per-pixel displacement mapping using relaxed cone steps
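A simplified, CPU-side sketch of this tracing loop is given below (Python/NumPy; names are ours, nearest-neighbour sampling, and a fixed iteration count instead of the shader loop). It performs relaxed cone steps until the ray is below the surface and then refines the hit with a binary search.

import numpy as np

def cone_step_trace(uv, dir_xy, dir_z, depth_map, cone_map, steps=32, bisections=8):
    """Relaxed cone stepping followed by binary search (simplified CPU version)."""
    p_xy = np.asarray(uv, dtype=np.float64); p_z = 0.0
    d_xy = np.asarray(dir_xy, dtype=np.float64); d_z = float(dir_z)  # depth gained per unit length
    len_xy = np.linalg.norm(d_xy)

    def sample(tex, q):
        h, w = tex.shape
        x = min(max(int(q[0] * (w - 1)), 0), w - 1)
        y = min(max(int(q[1] * (h - 1)), 0), h - 1)
        return tex[y, x]

    prev_xy, prev_z = p_xy.copy(), p_z
    for _ in range(steps):
        h = sample(depth_map, p_xy)
        if p_z >= h:                              # below the surface: stop cone stepping
            break
        c = sample(cone_map, p_xy)                # relaxed cone ratio at this texel
        t = c * (h - p_z) / (len_xy + c * d_z)    # step to the intersection with the cone
        prev_xy, prev_z = p_xy.copy(), p_z
        p_xy, p_z = p_xy + t * d_xy, p_z + t * d_z

    lo_xy, lo_z, hi_xy, hi_z = prev_xy, prev_z, p_xy, p_z   # last point above / first below
    for _ in range(bisections):                   # binary search between the two samples
        mid_xy, mid_z = 0.5 * (lo_xy + hi_xy), 0.5 * (lo_z + hi_z)
        if mid_z >= sample(depth_map, mid_xy):
            hi_xy, hi_z = mid_xy, mid_z
        else:
            lo_xy, lo_z = mid_xy, mid_z
    return hi_xy, hi_z

depth = np.random.rand(64, 64) * 0.5 + 0.25
cones = np.full((64, 64), 0.5)
print(cone_step_trace((0.1, 0.1), (0.7, 0.3), 0.6, depth, cones))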

Map creation In the original paper, the relaxed cone ratios are computed as shown in Algorithm 1: for each source texel (s,t) and each destination texel (u,v) a ray is traced, starting in (s,t) at depth 0 and passing through (u,v,depth(u,v)). The first intersection (q,r) after (u,v) is then used to compute the cone ratio (the ratio of the distance from (s,t) to (q,r) to the difference of their depths in the depth map). The relaxed cone ratio for a given source texel is the minimal cone ratio (clamped to 1.0) computed over all destination texels.

Algorithm 1 Original relaxed cone ratio computation
 1: for all source texels (s,t) of the depth map do
 2:   minRatio ← 1
 3:   source ← (s,t,0.0)
 4:   for all destination texels (u,v) of the depth map do
 5:     ray.origin ← (u,v,depth(u,v))
 6:     ray.direction ← ray.origin − source
 7:     (q,r) ← traceNextIntersection(ray, depthMap)
 8:     if depth(s,t) > depth(q,r) then
 9:       ratio ← (s − u, t − v).length / (depth(s,t) − depth(q,r))
10:       if ratio < minRatio then
11:         minRatio ← ratio
12:       end if
13:     end if
14:   end for
15:   relaxedConeMap(s,t) ← minRatio
16: end for

This approach, however, is not correct, since the ray used for computing the minimum cone ratio always starts at depth 0.0 in the source texel, but the rays traced during rendering may reach the source texel at different depths.

(a) Ray which yields the minimal relaxed cone ratio (b) Ray skipping surface features

Figure 3.4: Artifacts of the original algorithm for computing relaxed cone step maps

When these rays have a more shallow angle to the surface than the rays used for computing the cone ratios, they may skip surface features as in Figure 3.4. To devise a correct algorithm for computing the relaxed cone ratios, we note that rays traced during the rendering of the surface can have arbitrarily shallow incident angles, in the extreme case being parallel to the base surface. For these flat rays, we just have to check whether the surface height declines at the destination texel. This check, however, does not need ray tracing; we can just compare the depth of the destination texel to the depth of the texel one step further. Furthermore, we do not need to check all destination texels. By going outward from the source texel and iteratively computing the cone ratios, we can stop once all texels inside the cone defined by the minimal cone ratio found so far have been checked; texels outside this cone can not yield smaller cone ratios. The correct algorithm with this optimization is shown in Algorithm 2.

Algorithm 2 Relaxed cone ratio computation
 1: for all texels (s,t) of the depth map do
 2:   radius ← 1
 3:   depth ← depth(s,t)
 4:   minRatio ← 1
 5:   while radius / depthmap.width < depth · minRatio do
 6:     for all texels (s′,t′) = (s, t ± radius) or (s ± radius, t) do
 7:       step ← depth((s′,t′) + (s′ − s, t′ − t).normalize)
 8:       if depth(s′,t′) < depth(s,t) and depth(s′,t′) < step then
 9:         ratio ← (s′ − s, t′ − t).length / (depth(s,t) − depth(s′,t′))
10:         if ratio < minRatio then
11:           minRatio ← ratio
12:         end if
13:       end if
14:     end for
15:     radius ← radius + 1
16:   end while
17:   relaxedConeMap(s,t) ← minRatio
18: end for
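For reference, a direct but unoptimized transcription of Algorithm 2 might look as follows (Python/NumPy; we clamp at the texture border, normalize distances by the texture width so that the loop bound and the ratio use the same units, and visit the full square ring at each radius).

import numpy as np

def relaxed_cone_map(depth_map):
    """Unoptimized transcription of Algorithm 2 for a depth map with values in [0, 1]."""
    h, w = depth_map.shape
    cone = np.ones((h, w), dtype=np.float32)

    def d(x, y):                                   # clamped depth lookup
        return depth_map[min(max(y, 0), h - 1), min(max(x, 0), w - 1)]

    for t in range(h):
        for s in range(w):
            depth_s = depth_map[t, s]
            min_ratio, radius = 1.0, 1
            while radius / w < depth_s * min_ratio:
                ring = [(s + dx, t + dy) for dx in range(-radius, radius + 1)
                        for dy in range(-radius, radius + 1)
                        if max(abs(dx), abs(dy)) == radius]
                for s2, t2 in ring:
                    off = np.array([s2 - s, t2 - t], dtype=np.float64)
                    step_dir = off / np.linalg.norm(off)
                    ahead = d(int(round(s2 + step_dir[0])), int(round(t2 + step_dir[1])))
                    if d(s2, t2) < depth_s and d(s2, t2) < ahead:
                        ratio = (np.linalg.norm(off) / w) / (depth_s - d(s2, t2))
                        min_ratio = min(min_ratio, ratio)
                radius += 1
            cone[t, s] = min_ratio
    return cone

print(relaxed_cone_map(np.random.rand(16, 16))[0, 0])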

3.2 Image rectification

The rendering via relaxed cone step relief mapping creates images showing the surface in orthographic projection for all view and light directions. These images could be used directly for BTF rendering. However, this has three disadvantages:
• The rendered images are comparatively large, while the surface they contain can be quite small; most of each rendered image consists of empty background. This is not very efficient.
• Pixels in images with different view directions are not correlated at all, since the surface was rotated differently. This correlation, however, is important for good compression.
• When rendering the BTF, we need to access a pixel of the rendered surface by the texture coordinates of the entry point of the ray that was traced to generate this pixel. To do this, we would need to apply the orthographic projection for each pixel again during BTF rendering, which is an expensive operation.
To overcome these disadvantages, we use an additional transformation.

3.2.1 Transformation

In addition to addressing these disadvantages, a good transformation must satisfy one further property: since mipmap generation filters and resamples the images, thereby combining neighboring pixels, pixels that are neighbors in the original rendered images must be neighbors in the transformed image as well. This can be achieved by using affine transformations.

Figure 3.5: Transformation that rectifies the rendered images

Our transformation works as shown in Figure 3.5. It takes the smallest rectangle that contains the whole projected surface volume and maps it to a square texture. Additionally, it takes care that the vertex with texture coordinate (0,0,0) (or (0,0,1), depending on the view direction) of the original surface volume is always mapped to the lower left corner of the transformed image. This makes sure that all images for different view directions are aligned, i.e. the same pixels in two different images correspond to surface points of the original surface that are near each other. Of course, this correspondence cannot be perfect, since some surface points may be occluded for certain view directions, but visible for others.

3.2.2 Projection matrix

Instead of rendering the surface with an orthogonal projection and rectifying the result, it is easier to use a projection such that the rendered image is already rectified.

(a) surface volume from the top (b) . . . rotated by φr (c) . . . and tilted by θr

Figure 3.6: Transformation by the orthogonal projection for view direction (θr,φr)

To derive such a projection matrix, we have to see how the orthogonal projection for a view direction (θr,φr) transforms the surface volume. As seen in Figure 3.6, this projection can be composed of a rotation around the surface normal by angle φr and a tilting of the volume by angle θr, during which the top face of the projected volume is scaled in its vertical component by cos θr and the side faces of the volume get a height of sin θr. In the following, we consider the case 0 ≤ φr < π/2.


Figure 3.7: Derivation of the projection matrix for BTF rendering (1)

To find the correct projection matrix, we have to find the lengths of the edges a, b, c and d in Figure 3.7. From Figure 3.6, we see that α results from angle φr being scaled in its vertical component by cos θr and therefore α = atan(tan φr · cos θr). Similarly, β = atan(tan(90° − φr) · cos θr).

Next, we see that cos α = cos φr / a and hence a = cos φr / cos α. Similarly, b = sin φr / cos β. To find c and d, we take a closer look at the lower right corner of the quad in Figure 3.8, where we find an alternate angle to α and a congruent angle to β.


Figure 3.8: Derivation of the projection matrix for BTF rendering (2)

By the law of sines, we get

c / sin(π/2 − β) = depth · sin θr / sin(α + β).    (3.6)

This finally yields

c = depth · sin θr · sin(π/2 − β) / sin(α + β), and similarly    (3.7)
d = depth · sin θr · sin(π/2 − α) / sin(α + β).    (3.8)

We see that the transformation of Figure 3.5, which rectifies the images, preserves the ratios a : c and b : d. Therefore, we can finally compute the projection matrix, which is used to render the surface volume directly to a rectified square, as a homogeneous matrix:

projection =
    [  b/(b+d)      0           0   0 ]
    [  0            a/(a+c)     0   0 ]
    [ −d/(b+d)     −c/(a+c)     0   0 ]     (3.9)
    [  d/(b+d)      c/(a+c)     0   0 ]

This projection is valid for 0° ≤ φr ≤ 90°. For larger angles, we first compute the ratios using φr − k · 90° and then apply an additional rotation to the final projection matrix to make sure that the corner (0,0,0) of the original surface volume is always mapped to the lower left corner of the transformed image.
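To make the derivation above concrete, the following Python sketch (not part of the thesis implementation; it assumes NumPy, and the function name is hypothetical) assembles the matrix of Equation (3.9) from a view direction (θr, φr) and the normalized depth of the surface volume, following Equations (3.6) to (3.8):

import numpy as np

def rectifying_projection(theta_r, phi_r, depth=1.0):
    # theta_r, phi_r in radians with 0 <= phi_r <= pi/2; `depth` is the
    # normalized depth of the surface volume.
    alpha = np.arctan(np.tan(phi_r) * np.cos(theta_r))
    beta = np.arctan(np.tan(np.pi / 2 - phi_r) * np.cos(theta_r))
    a = np.cos(phi_r) / np.cos(alpha)
    b = np.sin(phi_r) / np.cos(beta)
    s = depth * np.sin(theta_r) / np.sin(alpha + beta)
    c = s * np.sin(np.pi / 2 - beta)   # Eq. (3.7)
    d = s * np.sin(np.pi / 2 - alpha)  # Eq. (3.8)
    return np.array([
        [ b / (b + d),  0.0,          0.0, 0.0],
        [ 0.0,          a / (a + c),  0.0, 0.0],
        [-d / (b + d), -c / (a + c),  0.0, 0.0],
        [ d / (b + d),  c / (a + c),  0.0, 0.0],
    ])

For a view direction straight from the top (θr = 0 together with φr = 0) some of the ratios degenerate to 0/0, so such directions, as well as φr outside [0°, 90°], which the text above handles by an additional rotation, would need to be special-cased.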

3.3 Filtering & Compression

As previously discussed, a BTF can be filtered by creating a Gaussian pyramid for each image rendered for a given view and light direction. For filtering and compression, we used the same general approach as Ma et al. in [MCT+05]. As a first step, the images are converted from RGB color space to YCbCr. In this color space, colors are described by the luminance component Y and the chroma components Cb and Cr. This conversion is used because the chroma components Cb and Cr can be compressed to a higher degree: human vision detects small differences in brightness much better than small differences in hue. Compression of the mipmaps is done using principal component analysis (PCA). Since for PCA we need to store both the mean and the set of eigenvectors, there is a possible optimization: we can use a Laplacian pyramid, in which the mean is zero, instead of a Gaussian pyramid. This removes the need to store all mean vectors, while on the other hand being more costly to reconstruct.
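As an illustration of this first step, the following Python sketch converts an RGB image to YCbCr. The exact conversion constants used in the thesis are not stated; ITU-R BT.601 luma coefficients are assumed here, with the chroma components centered at zero:

import numpy as np

def rgb_to_ycbcr(rgb):
    # `rgb` is a float array of shape (H, W, 3) with values in [0, 1].
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    y = 0.299 * r + 0.587 * g + 0.114 * b   # luminance
    cb = 0.564 * (b - y)                    # chroma components: later compressed
    cr = 0.713 * (r - y)                    # with fewer PCA components than Y
    return np.stack([y, cb, cr], axis=-1)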

3.3.1 Laplacian pyramid

The Laplacian pyramid decomposes the BTF images into a set of band-pass filtered images with decreasing frequencies. Besides having zero mean, these band-pass filtered images are highly compressible. The Laplacian pyramid consists of difference images of the Gaussian pyramid. The Gaussian pyramid can be computed starting from the original BTF images G_0^(ωi,ωr) = I^(ωi,ωr) by repeatedly applying a low-pass filter f and downsampling the result to half its size to get the next image G_(i+1)^(ωi,ωr) = downsample(f(G_i^(ωi,ωr))) of the Gaussian pyramid.
From this Gaussian pyramid, the Laplacian pyramid is computed by taking the difference of each image of the Gaussian pyramid with the upsampled version of the next level of the pyramid: L_i^(ωi,ωr) = G_i^(ωi,ωr) − upsample(G_(i+1)^(ωi,ωr)). In this thesis, we use a simple box filter as low-pass filter and nearest-neighbor upsampling.
The Laplacian pyramid has one level less than the Gaussian pyramid, since we cannot compute the difference for the last level. With knowledge of this last level of the Gaussian pyramid and the whole Laplacian pyramid, we can reconstruct the original BTF image (and all levels of the Gaussian pyramid) by repeatedly upsampling the last reconstructed image and adding the next level of the Laplacian pyramid: G_i^(ωi,ωr) = upsample(G_(i+1)^(ωi,ωr)) + L_i^(ωi,ωr).
The final filter pyramids F_k^(ωi,ωr) will therefore contain one level G_n^(ωi,ωr) of the Gaussian pyramid and all larger levels of the Laplacian pyramid L_m^(ωi,ωr), where m < n. In practice, the largest level of the Laplacian pyramid has the size 64 × 64, whereas the last level of the filter pyramid, i.e. the Gaussian level, is not of size 1 × 1, but 4 × 4 instead. The reason is that for this size the number of pixels in this Gaussian level is close to the number of principal components used for compression, and hence the Gaussian level can be perfectly compressed.
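The construction and the reconstruction of such a filter pyramid can be summarized by the following Python sketch (a minimal NumPy illustration, not the thesis code; it assumes square images with power-of-two side length and an (H, W, 3) channel layout):

import numpy as np

def downsample_box(img):
    # Box filter and 2x downsampling in one step: average 2x2 pixel blocks.
    h, w = img.shape[:2]
    return img.reshape(h // 2, 2, w // 2, 2, -1).mean(axis=(1, 3))

def upsample_nearest(img):
    # Nearest-neighbor upsampling: duplicate each pixel into a 2x2 block.
    return img.repeat(2, axis=0).repeat(2, axis=1)

def build_filter_pyramid(image, last_size=4):
    # Returns [L0, L1, ..., L_{n-1}, G_n]: the Laplacian levels plus the last
    # (4 x 4) Gaussian level.
    gaussian = [image.astype(np.float64)]
    while gaussian[-1].shape[0] > last_size:
        gaussian.append(downsample_box(gaussian[-1]))
    laplacian = [g - upsample_nearest(g_next)
                 for g, g_next in zip(gaussian[:-1], gaussian[1:])]
    return laplacian + [gaussian[-1]]

def reconstruct(pyramid):
    # Invert the decomposition: upsample and add, from coarsest to finest.
    img = pyramid[-1]
    for lap in reversed(pyramid[:-1]):
        img = upsample_nearest(img) + lap
    return img

Starting from a 64 × 64 image this yields five filter levels, and reconstruct(build_filter_pyramid(img)) returns the original image up to floating-point rounding, which is the property the renderer later exploits when it adds only some of the Laplacian levels.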

3.3.2 Principal component analysis

The compression works by applying PCA to the whole BTF space. This is done by taking all BTF images F_k^(ωi,ωr)(p) of the same level k in the filter pyramids for all view and light directions ωr and ωi, resulting in the filtered BTF F_k(p,ωi,ωr) for level k.

The images F_k^(ωi,ωr)(p) can be regarded as a set of ABRDFs F_k^p(ωi,ωr), and each ABRDF can be treated as a large vector of 81 · 81 components. Then, PCA can be performed on all these vectors F_k^p for level k to find the n most dominating principal components (also called eigen ABRDFs) e_kj, with 1 ≤ j ≤ n. PCA is run separately for each component Y, Cb and Cr and for each level k of the filter pyramid.

Besides the principal components, weights d_kj, with 1 ≤ j ≤ n, are needed for the approximate reconstruction of the filtered BTFs. These weights are linear factors for the corresponding principal components and are computed for each surface point p. Since the principal components are orthogonal to each other, the weights can be computed using a simple scalar product:

d_kj(p) = Σ_(ωi,ωr) F_k(p,ωi,ωr) e_kj(ωi,ωr).    (3.10)

Using the extracted principal components e_kj and the weights d_kj, the filtered images F_k^(ωi,ωr) can be approximately reconstructed as follows:

F_k(p,ωi,ωr) = Σ_(j=1)^(n) d_kj(p) e_kj(ωi,ωr).    (3.11)
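A minimal Python sketch of this per-level compression and reconstruction is given below. It assumes that the filtered BTF of one level and one color channel has already been arranged as a matrix F with one ABRDF row of 81 · 81 values per surface point, and that the level is a Laplacian level with (approximately) zero mean, so no mean vector needs to be subtracted; the function names are hypothetical:

import numpy as np

def compress_level(F, n_components):
    # F has shape (num_texels, 81*81). The right singular vectors of F are the
    # principal directions of the ABRDF set (uncentered PCA, valid for the
    # zero-mean Laplacian levels).
    _, _, vt = np.linalg.svd(F, full_matrices=False)
    eigen_abrdfs = vt[:n_components]          # e_kj
    weights = F @ eigen_abrdfs.T              # d_kj(p), Eq. (3.10)
    return eigen_abrdfs, weights

def reconstruct_level(eigen_abrdfs, weights):
    # Approximate reconstruction of F, Eq. (3.11).
    return weights @ eigen_abrdfs

The rows of eigen_abrdfs correspond to the eigen ABRDFs e_kj and the columns of weights to the per-texel factors d_kj(p); these are the quantities that are later packed into the eigen map and the weight map of Section 3.3.3.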

Last filter pyramid level
As discussed before, the last level k of the filter pyramid is a Gaussian filtered image instead of a Laplacian image. These images G_k^(ωi,ωr)(p) can again be regarded as a BTF G_k(p,ωi,ωr) and compressed the same way using PCA. However, this time the mean vector µk is not zero and has to be stored separately (and also used for reconstruction). In [MCT+05], the authors suggest fitting the Y component of the mean vector using a Blinn-Phong model

µk ≈ ka + kd(n · l) + ks(n · h)^s    (3.12)

and approximating the Cb and Cr components by a constant. However, this is not a good approximation in most cases. This can easily be seen for the introductory example in Chapter 1: for the case of a mountain range with snowy peaks and lush, green valleys, when viewing from the top the mean color will be a mix of green and white; when viewing from flat angles, the valleys will be occluded and the mean color will be mostly white. For this reason, in many cases the mean color indeed changes with view and light directions. Therefore we fit a Blinn-Phong model to all three components Y, Cb and Cr, which yields a total of 10 parameters: three components each for ka, kd and ks, and the shininess s. The fitting results in an optimization problem that can be solved using the Levenberg-Marquardt algorithm. Alternatively, we can use the mean ABRDF directly.
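As an illustration of this fitting step, the sketch below fits the 10 Blinn-Phong parameters to a mean ABRDF with a Levenberg-Marquardt solver. The thesis uses the ALGLIB library; SciPy's 'lm' method stands in for it here, and light_dirs, view_dirs and mean_abrdf are assumed to be (81 · 81, 3) arrays holding the tangent-space sample directions and the corresponding YCbCr values of the mean vector:

import numpy as np
from scipy.optimize import least_squares

def blinn_phong(params, light_dirs, view_dirs):
    ka, kd, ks = params[0:3], params[3:6], params[6:9]   # one coefficient per channel
    shininess = params[9]
    n_dot_l = np.clip(light_dirs[:, 2], 0.0, 1.0)        # n = (0, 0, 1) in tangent space
    half = light_dirs + view_dirs
    half /= np.linalg.norm(half, axis=1, keepdims=True)
    n_dot_h = np.clip(half[:, 2], 1e-6, 1.0)
    return (ka[None, :] + kd[None, :] * n_dot_l[:, None]
            + ks[None, :] * n_dot_h[:, None] ** shininess)

def fit_mean(mean_abrdf, light_dirs, view_dirs):
    def residuals(params):
        return (blinn_phong(params, light_dirs, view_dirs) - mean_abrdf).ravel()
    x0 = np.concatenate([np.full(9, 0.1), [10.0]])        # rough initial guess
    return least_squares(residuals, x0, method="lm").x    # ka, kd, ks (3 each) and s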

Applying PCA to a whole filtered BTF F_k as described has one obvious disadvantage: due to the large data size, PCA will need a long time to finish. However, compared to most other BTF compression approaches that enable real-time rendering, the compression ratio is much better for the same visual quality. Also, since the compression is done in a preprocessing step, the runtime of the algorithm is not too important.

3.3.3 Texture packing

To reduce the number of needed texture units, all results from the PCA stage are packed into two large textures. The first texture contains all the weights, the second texture the eigen ABRDFs. Figure 3.9 shows a layout for 8 eigen ABRDFs for Y and 2 eigen ABRDFs for Cb and Cr each, denoted by 8/2/2. For the case of 16/4/4 eigen ABRDFs, we simply double the texture size to store the additional components.


Figure 3.9: Layout for the textures containing the eigen ABRDFs ek j (bottom) and the weights dk j (top)

3.4 Handling of empty border pixels

While the transformation of the rendered BTF images uses the size of the images as effectively as possible, for all view directions except directly from the top there will be empty border pixels, i.e. pixels for which relief mapping did not find an intersection with the surface. These empty border pixels will not show up during rendering, since they do not correspond to a surface point, but they will affect the compression: simply setting the color of these pixels to black introduces visible compression artifacts. Therefore it is important to find a good way to deal with these empty border pixels. In the following, we present two different approaches. Both are based on a filtering step that takes empty border pixels into account and lets the box filter average only those pixels that are not empty; the approaches differ in how they fill the empty border pixels of the images at each filter level.

Filling
The first way is to simply fill these empty border pixels according to the non-empty pixels. The easiest of these filling approaches computes the average color of all rendered pixels of a BTF image and fills all empty pixels with this color. This can be done for every image of the BTF, i.e. for each view and light direction, as in the sketch below. More sophisticated approaches, like an interpolation from the nearest filled pixels in the six dimensions of the full BTF (the texture coordinate and two directions), are also possible. However, this is usually not necessary, since we only want to reduce compression artifacts, and not obtain a visually convincing value for these (never to be rendered) border pixels.
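A minimal sketch of this average fill for a single BTF image (assuming an (H, W, 3) image and a boolean mask marking the empty border pixels):

import numpy as np

def fill_with_average(image, empty):
    # Replace every empty border pixel by the mean color of the rendered pixels.
    filled = image.copy()
    if (~empty).any():
        filled[empty] = image[~empty].mean(axis=0)
    return filled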

Multiple PCA passes
Another possibility is to use a PCA method that supports the presence of missing values. A simple one is the imputation algorithm [IR10], which iteratively estimates the missing values and applies PCA. This is done by first filling the empty values of the data matrix, e.g. with row-wise means (which is exactly the same as the average pixel color for each image discussed above). Then, PCA is applied and the resulting most important principal components are used to compute a new estimate for the missing values. These two steps can then be repeated until convergence; a sketch is given below. However, this algorithm is extremely slow, since it executes the normal PCA algorithm, which already takes quite long because of the large data matrix, multiple times. Faster algorithms that perform PCA in the presence of missing values are available, but most of them are conceptually much more complicated.
While this approach was expected to yield even higher compression quality, Chapter 5 shows that the reconstruction quality for surfaces compressed using multiple PCA passes is nearly the same as for the simple filling of the border pixels at each filter level. Therefore, it is usually not worth the effort.
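The iterative imputation can be sketched as follows (a minimal NumPy illustration, not the thesis implementation). The data matrix F is assumed to have one ABRDF row per texel and one column per rendered (view, light) image, so the initial fill with the per-image column average corresponds exactly to the mean fill described above:

import numpy as np

def pca_low_rank(F, n_components):
    # Rank-n PCA approximation of F (with centering).
    mean = F.mean(axis=0)
    _, _, vt = np.linalg.svd(F - mean, full_matrices=False)
    basis = vt[:n_components]
    return mean + (F - mean) @ basis.T @ basis

def impute_pca(F, missing, n_components, passes=2):
    # `missing` is a boolean mask of the empty border pixels.
    F = F.copy()
    valid = ~missing
    col_mean = np.where(missing, 0.0, F).sum(axis=0) / np.maximum(valid.sum(axis=0), 1)
    F[missing] = np.broadcast_to(col_mean, F.shape)[missing]   # initial fill: per-image average
    for _ in range(passes):                                    # 2PCA / 3PCA in Section 5.3.1
        F[missing] = pca_low_rank(F, n_components)[missing]
    return F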

4 Rendering

In this chapter, we describe the rendering procedure, which renders the filtered surface in real-time. As previously discussed, the renderer combines the relief renderer introduced in section 3.1.2 with a renderer for the BTF data, to achieve high quality for all possible resolutions. We will start with a description of the BTF renderer, and then show how it can be combined with the relief renderer.

4.1 BTF rendering

The BTF renderer uses the compressed BTF computed during the preprocessing and maps it onto the rendered surface volume. For this, it first has to compute the texture coordinate for each point and the direction indices, which are used to index the BTF; then the BTF is decompressed and drawn. The decompression will use only some of the levels of the Laplacian pyramid, so that the result is rendered corresponding to the level of detail. Since the BTF contains only a discrete set of directions, but the rendering can be done from arbitrary directions, we also need some kind of interpolation. The decompression of the BTF is done per-pixel, hence the algorithms presented in the remainder of this chapter are ideally implemented as a fragment shader.

4.1.1 Projection & Directions

The BTF is indexed by 3 variables: the texture coordinate, and the view and light direction index. For the latter, we can just transform the view and light directions into tangent space of the surface and then look up the index of the nearest discrete direction used in the construction of the BTF. This lookup is done using a precomputed texture as in Figure 4.1, which returns the index of the nearest direction i = lookup(ω = (θ,φ)), given the polar angle φ and inclination θ of some arbitrary direction ω in tangent space. For the texture coordinate, we have to transform our three-dimensional surface volume coordinates into 2D texture coordinates to look up the BTF. In fact, the transformation is exactly the same as the one described


Figure 4.1: Texture to look up the nearest direction index for direction ω = (θ,φ)

in Section 3.2.2. So, to compute the texture coordinate, we have to compute the projection matrix based on the (nearest) view direction and transform the surface volume coordinates with this matrix. To reduce the overhead, instead of computing this transformation matrix, we store its 6 parameters in another precomputed texture for fast lookup.
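How such an index map can be precomputed is sketched below (Python/NumPy, not the thesis code). sampled_dirs is assumed to be the array of the 81 tangent-space unit vectors used when generating the BTF, and the resolution of the map is arbitrary:

import numpy as np

def build_index_map(sampled_dirs, height=64, width=256):
    # One texel per (inclination, polar angle) pair, storing the index of the
    # nearest sampled BTF direction (largest dot product).
    theta = np.linspace(0.0, np.pi / 2, height)    # inclination
    phi = np.linspace(0.0, 2.0 * np.pi, width)     # polar angle
    t, p = np.meshgrid(theta, phi, indexing="ij")
    dirs = np.stack([np.sin(t) * np.cos(p),
                     np.sin(t) * np.sin(p),
                     np.cos(t)], axis=-1)          # (height, width, 3) unit vectors
    dots = dirs @ sampled_dirs.T                   # (height, width, 81)
    return dots.argmax(axis=-1).astype(np.int32)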

4.1.2 Decompression

Once we have the direction indices and the texture coordinate, we can decompress the BTF. The result of this is a sum of the mean and several Laplacian levels, converted to RGB. If the mean was fitted to a Blinn-Phong model, it can simply be evaluated for the view and light direction in tangent space. If the mean was directly stored in a texture, then we can index this texture using the nearest view and light direction index. To the value of the mean vector, we add the principal components of the Gaussian level of the filter pyramid and possibly several Laplacian levels, which were all compressed using PCA. The reconstruction of the data works as described in Section 3.3.2: as the sum of weighted principal components. So, for the nearest view direction index ir, the nearest light direction index ii, and texture coordinate t, we get the value of the kth filter level using n principal components as:

L_k(t,ωi,ωr) = Σ_(j=1)^(n) d_kj(t) e_kj(ωi,ωr).    (4.1)

The result is, of course, still in YCbCr color space and must be converted to RGB for rendering. Taken all together, this yields the following algorithm:

Algorithm 3 Decompression
Input: position p, tangent space TBN, view and light directions ωr and ωi
1: directions in tangent space: ωr′ = TBN · ωr, ωi′ = TBN · ωi
2: nearest direction indices: ir ← lookup(indexmap, ωr′), ii ← lookup(indexmap, ωi′)
3: projection matrix: proj ← lookup(projmap, ir)
4: texture coordinate: t ← proj · p
5: result ← lookup(mean, ir, ii) or ka + kd((0,0,1)ᵀ · ωi′) + ks((0,0,1)ᵀ · normalize(ωr′ + ωi′))^s
6: for all filter levels k do
7:   for all principal components c ∈ 1,...,n do
8:     result ← result + lookup(weightmap, t, k, c) · lookup(eigenmap, ir, ii, k, c)
9:   end for
10: end for
11: return YCbCrToRGB(result)

4.1.3 Level of Detail

To achieve level of detail (LoD) rendering, we use only some of the Laplacian levels depending on resolution. That is, whenever the camera is far away from the surface, we take the mean and add no or only a few levels of the filter pyramid, so that the rendered surface contains only the low frequencies; when the camera moves closer to the surface, we use gradually more and more pyramid levels for rendering, to include the higher frequencies. Given this idea, we define the LoD parameter as the logarithmic distance:

lod = log2(distance) + bias    (4.2)

One could additionally use the cosine of the inclination of the view direction in the definition, since for shallow viewing angles, the surface is shortened—or even better, use an anisotropic filtering procedure instead of the isotropic one in our case. However, we found the results with this simple distance metric adequate for our needs. Using this LoD parameter, the rendering function will decompress all levels of the filter pyramid for lod = 0 and only the last level of the pyramid for lod = kmax; or more specifically, for a given LoD parameter lod, the renderer will decompress exactly the filter pyramid levels Ffloor(lod) to Fkmax (see Figure 4.2).

(a) lod = 0 (b) lod = 1

(c) lod = 2 (d) lod = 3

Figure 4.2: Rendered images for different LoD parameters

The bias for the LoD parameter should be chosen such that for lod = 0 the size of the rendered surface, when viewed along the normal direction, is the same as the size of the largest level of the filter pyramid. This minimizes aliasing, while not introducing excessive blur. To prevent sudden changes when the lod changes by moving toward or away from the surface, we additionally multiply the largest decompressed level of the filter pyramid Ffloor(lod) with a weight of 1 − (lod − floor(lod)), so that this finest level fades in smoothly as the camera approaches. This achieves a smooth interpolation according to the LoD parameter.
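The level selection and the blending can be summarized by the following Python sketch (a minimal illustration under the conventions above, not the thesis code); k_max is the index of the last, Gaussian, level of the filter pyramid:

import math

def lod_parameter(distance, bias):
    # Eq. (4.2), clamped at 0.
    return max(0.0, math.log2(distance) + bias)

def levels_to_decompress(lod, k_max):
    # Returns a list of (filter level, weight) pairs for levels floor(lod) .. k_max.
    lod = min(lod, float(k_max))
    first = int(math.floor(lod))
    levels = []
    for k in range(first, k_max + 1):
        w = 1.0 - (lod - first) if k == first else 1.0   # fade the finest level smoothly
        levels.append((k, w))
    return levels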

4.1.4 Interpolation

If only the single nearest view and light direction is used for rendering, the rendered image will show discontinuities at points where this nearest direction jumps to the next one. To avoid these discontinuities, we need an additional direction interpolation. For this, there are two possibilities: In [MCT+05], the authors note that for the LoD representation of the surface it is enough to smoothly interpolate only the mean and still use just the nearest directions for the filter levels. Since the mean is the average of all pixel ABRDFs, it represents the basic appearance of the surface, with the filter levels encoding the highlights and additional surface features, where interpolation might not be as important as for the mean. However, we suspect that discontinuities in these Laplacian levels are still noticeable. Hence the other option is to interpolate between all of the filter levels as well, as was done in [FH09].

As discussed in Section 3.3.2, the mean µk can either be approximated by a Blinn-Phong model or be stored directly. In case of a fitted model, the interpolation is easy: We can just evaluate the model for the exact view and light direction instead of using the nearest BTF directions.

(a) Nearest directions encoded in the R, G and B channel (b) The corresponding weights of these indices encoded in the R, G and B channel

Figure 4.3: Index map, encoding the nearest three directions and their weights. This map is a floating point texture, where the indices are stored in the integer part, and the weights in the fractional part of each channel.

For mean vectors stored as an ABRDF texture and for all Laplacian levels, we need to find the three nearest BTF directions for both view and light, and compute the barycentric coordinates, i.e. the weights, such that the linear combination of these three directions yields the original view or light direction. Again, instead of computing these directions in the renderer, we use another texture which stores the indices and weights of the three nearest BTF directions for each direction ω = (θ,φ), as seen in Figure 4.3.

Algorithm 4 Interpolated decompression
1: result ← 0
2: for all three nearest view indices v do
3:   look up projection matrix proj(v)
4:   t ← proj · p
5:   for all three nearest light indices l do
6:     result ← result + weight(v) · weight(l) · decompress(t, v, l)
7:   end for
8: end for

Then, the mean and the Laplacian level can be evaluated for each combination of the nearest view and light directions, in total nine times. The results of these evaluations are added together weighted with the product of the corresponding direction weights. We must also ensure that the projection used to find the texture coordinate for the current point changes for each new view direction. The full interpolation of a single filter level can be achieved as described in Algorithm 4.

4.2 Transition between the BTF and relief renderer

For rendering at high resolutions, that is, when the camera is very close to the surface, the LoD filtering is not needed anymore and the rendering of the BTF data looks very blurred due to its low resolution. In this case, it is favorable to use the relief renderer from Section 3.1 instead, which was originally used to generate the BTF data. Here, we can use a similar approach to the one described in Section 2.3.2, where different algorithms are also used for different resolutions and there is a smooth transition between them. In our case, that means that the BTF rendering approach is used whenever the LoD parameter is 0 or greater. For a LoD parameter of −1 or smaller, the relief renderer is used. For a LoD parameter −1 < lod < 0, both approaches are used to render a given pixel, and the results of the relief renderer and the BTF renderer are weighted with −lod and lod + 1, respectively, to get a linear interpolation that varies smoothly over the whole LoD range. To get the correct silhouette, we always apply the per-pixel displacement mapping, even when using only the BTF renderer. However, the displacement mapping is then only used to discard fragments where no intersection is found; the found intersections are not used for the BTF lookup, but the original points on the surface volume are used instead. The relief renderer, on the other hand, uses the exact intersections found during displacement mapping for shading and additionally traces the shadow ray.
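The blending of the two renderers can be written as a small sketch (Python; the function name is hypothetical):

def renderer_weights(lod):
    # Returns (relief_weight, btf_weight) for a given LoD parameter.
    if lod <= -1.0:
        return 1.0, 0.0        # close to the surface: relief renderer only
    if lod >= 0.0:
        return 0.0, 1.0        # far away: filtered BTF only
    return -lod, lod + 1.0     # linear cross-fade in the transition range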

4.3 Curved surfaces

The relief renderer using relaxed cone stepping does not support curved base surfaces. To be able to use the BTF-based filtering method on curved surfaces as well, a different algorithm to perform the per-pixel displacement mapping is needed. As discussed in Section 2.1.3, there are many different methods. Since the prism-based methods are generally much slower, as they create a lot of new triangles for the extruded prisms, we selected one of the methods that trace the bent ray in tangent space, namely normal-based curved silhouette rendering [CC08]. This approach uses a normal and a tangent map, from which the exact tangent space for each point on the base surface can be computed, giving accurate results for the ray bending. Curved relief mapping introduced in [OP05], on the other hand, uses a local approach, where the curvature of each point is approximated as a quadric; this curvature information can be used to compute the tangent space of other surface points, but its accuracy decreases for long rays, possibly resulting in artifacts. On the other hand, curved relief mapping saves many texture lookups, yielding better performance. The aforementioned normal and tangent maps for normal-based curved silhouette rendering are computed by rendering the normal and tangent of each triangle of the base surface in texture space, i.e. each vertex of the triangle is projected to its texture coordinates. As an example, the normal and tangent map for a torus can be seen in Figure 4.4. In addition, the normal and tangent maps store scaling factors for the tangent and bi-tangent directions, which are needed since the texture coordinates are usually not uniformly distributed. For our torus example, the same distance in toroidal direction on the inside and on the outside of the torus will result in different texture coordinate differences, which has to be compensated by the scaling factors.

(a) Normal map (b) Tangent map

Figure 4.4: Normal and tangent map for a torus. The horizontal axis corresponds to the toroidal direction, the vertical axis corresponds to the poloidal direction.

The process of intersecting the ray with the curved depth map is shown in Algorithm 5. In detail, the ray is traced from the intersection with the base surface iteratively in tangent space, using a constant step size.

Algorithm 5 Per-pixel displacement mapping for curved surfaces
Input: texture coordinate p0 = (u,v,0) of the current fragment, view direction d in eye space, model-view matrix V
1: for i = 0 to MAXSTEPS do
2:   (n, sb) ← lookup(normalmap, pi.xy)
3:   (t, st) ← lookup(tangentmap, pi.xy)
4:   normal ← V · n, tangent ← V · t
5:   TBN ← (tangent, normal × tangent, normal)
6:   S ← (st, sb, −1/MAXDEPTH)
7:   step ← S · TBNᵀ · d
8:   pi+1 ← pi + STEPSIZE · step
9:   if pi+1.z < 0 then
10:    discard fragment and return
11:  else if pi+1.z ≥ lookup(depthmap, pi+1.xy) then
12:    di ← lookup(depthmap, pi.xy) − pi.z
13:    di+1 ← pi+1.z − lookup(depthmap, pi+1.xy)
14:    return pi + di / (di + di+1) · STEPSIZE · S · TBNᵀ · d
15:  end if
16: end for

For this, we have to compute the tangent space matrix TBN (lines 2-5): normal and tangent can be found by looking them up in the corresponding textures and multiplying both vectors with the model-view matrix V; the bi-tangent can be computed as the cross product of the normal and tangent vectors. The scaling factors for the tangent and bi-tangent can be read from the textures as well; the scaling factor in normal direction is simply −1/MAXDEPTH (line 6). To find the direction for the next step of the ray in tangent space, we multiply it with the transposed tangent space matrix, scaled with the scaling factors S (line 7). Since the tangent space will be different for each point pi, this bends the ray. Then, we can finally move along this direction one step using the constant STEPSIZE (line 8). After each step, we test whether the ray left the base surface again, at which point the fragment is discarded, or whether the ray hit the depth field. In the latter case, we compute the intersection of the view ray with the linear interpolation of the depth field between the last two points pi and pi+1 (lines 12-14). Since the step size already needs to be small to not miss any features, this gives reasonably good results, though a binary search may be used to further increase the accuracy. In all other cases, the ray is traced further, up to MAXSTEPS steps; if still no intersection was found at this point, the fragment is discarded as well. Similarly, shadows can be traced in tangent space, starting from the surface point found during the regular displacement mapping step, by tracing a ray in the direction towards the light source. If another intersection with the depth map is found, or the ray goes into the surface, we know that the point is shadowed. If no intersection is found, the surface point is illuminated by the light source and can be shaded accordingly.

4.3.1 Using BTFs with curved surfaces

Rendering using curved surfaces has a few implications for the preprocessing pipeline. The reason is that closed objects like the torus or a sphere do not have a border like the cuboid-shaped surface volume used for flat surfaces. In other words, the third component of the texture coordinate is always zero, which means we do not need the additional transformation derived in Section 3.2.2, but only render the top surface of the surface volume directly to a square texture for each view and light direction. Similarly, the projection is also not needed for the BTF renderer; we can use the original texture coordinate of each fragment directly. Additionally, textures for closed objects may join at opposite sides, as for the torus, or be tiled. In this case, one can simply repeat the depth texture and not discard pixels in the relief mapper for which intersections (u,v) outside the range 0 ≤ u,v ≤ 1 are found. This means that the rendered images will not have empty border pixels. Then, the BTF images can be tiled as well. Using this approach instead of relaxed cone step relief mapping, the mixed renderer with BTF-based filtering can be used for curved surfaces as well.

5 Evaluation

The BTF construction and rendering algorithms have been implemented in C# using OpenGL via the OpenTK library. For the Levenberg-Marquardt algorithm and the principal component analysis, the ALGLIB library was used. In this chapter, we perform some analysis to demonstrate how the method, and especially the compression, behaves for different surface models and the various parameters. All measurements were run on a computer with a quad-core Intel Core i5-2400 CPU running at 3.1 GHz, 4 GB of RAM and an AMD Radeon HD 6800 GPU with 1 GB of graphics RAM.

5.1 Performance

Since one of the requirements for this method was to be able to render the surface in real-time, the performance of the renderer plays an important role. For practical applications that may use this filtering method, the total memory size of all textures is also important, since such applications typically use not just a single surface but many different surfaces at the same time. Additionally, we evaluated the performance of all preprocessing steps, as this might affect the usage for example in content pipelines.

5.1.1 Preprocessing

Preprocessing consists of various steps: First, the relief map is computed from the depth map (1); then the relief renderer is used to render the surface volume for all view and light directions (2). Subsequently, a Laplacian pyramid is generated for each image (3), which is then compressed using the PCA algorithm (4). The principal components and their weights are then combined into the final eigen and weight maps (5). Optionally, the mean vector µk resulting from the compression of the last level of the filter pyramid can be fitted to a Blinn-Phong model (6).

Preprocessing step     run time [in s]
(1) reliefmap          0.69
(2) reliefrenderer     262
(3) filter             1061
(4) pcacompress        4156
(5) combine            0.1
(6) fitmean            195

Table 5.1: Run times of the preprocessing steps

Most of these preprocessing steps take quite a long time to complete, as shown in Table 5.1. For most steps, the performance is limited by IO operations, since a lot of data is written to and read from the hard drive. In particular, to generate the filter pyramids, the images for all view and light directions, i.e. 81 · 81 = 6561 in total, are read, and for each image, the five levels of the filter pyramid need to be stored, resulting in 32805 new files.
The performance of the compression of the filter pyramids and of the fitting of a reflection model to the mean vector is limited by the performance of the underlying algorithms: principal component analysis and the Levenberg-Marquardt algorithm. Similarly, the performance of the computation of the relief map from a depth map is limited by the CPU. Since the relief map generation is inherently parallel and PCA needs to be performed many times (once for each level of the filter pyramids and each color channel Y, Cb and Cr), both have been optimized to execute in multiple threads. Further optimizations, especially regarding the file handling, have not been applied. By integrating all steps into a single program and holding all intermediate data in memory, we could probably achieve huge performance improvements, but on the other hand the preprocessing pipeline would lose some of its flexibility.

5.1.2 Rendering

For rendering, the performance is promising. Even with full interpolation of the mean vector and all levels of the filter pyramid, and when using 16/4/4 components for the Y/Cb/Cr channels, the renderer can draw a surface with LoD parameter lod = 0 at a resolution of 1024 × 768 at 32 frames per second. For 8/2/2 components and when only interpolating the mean instead of all filter levels, the performance increases to 292 frames per second, and to 331 frames per second with a fitted mean model, as shown in Table 5.2.

interpolation type        16/4/4   8/2/2
full interpolation        32       61
interpolate mean only     201      292
use fitted mean model     215      331

Table 5.2: Performance in frames per second for rendering surfaces compressed with 16/4/4 and 8/2/2 components for different types of interpolation

We found that the performance does not depend on the type of surface, i.e. it was basically identical for all tested surface models. Besides the number of compressed components for the Y, Cb and Cr channels and the interpolation type, the performance was approximately proportional to the number of pixels that were rendered using the BTF. Additionally, higher LoD parameters result in fewer pyramid levels being decompressed, which increases performance as well. In practice, most scenes will probably show better performance than given here, since most of the time many objects in the scene will either be rendered using relief mapping, if they are close to the camera, or use various LoD levels depending on the distance to the camera. Scenes where all objects are rendered with lod = 0 should be quite rare.

Texture sizes
The renderer uses a lot of textures. The relief mapping needs a relief map and textures for the ambient, diffuse and specular color of the surface. The BTF renderer additionally needs the compressed eigen map and weight map, as well as the mean vector. Finally, the projection texture and the index map are used to improve the performance. For 16/4/4 compressed components and a texture size of 512 × 512 for the relief, color and index maps, this yields a total size of roughly 10.6 MiB (see Table 5.3). This means that it is possible to store textures for many different surfaces, as they typically appear in complex scenes, in the graphics memory of current GPUs. The uncompressed BTF, on the other hand, without including all other needed textures, has a size of at least 76.9 MiB, limiting its usage to only a few surfaces per scene.

Texture                                 Channels    Format   Resolution       Size [in byte]
relief map                              4           byte     e.g. 512 × 512   1048576
diffuse, specular and ambient texture   4 (RGBA)    byte     e.g. 512 × 512   1048576
eigen map                               4           float    405 × 486        3149280
weight map                              4           float    256 × 128        524288
mean vector                             3 (YCbCr)   float    81 × 81          78732
index map                               3           float    e.g. 512 × 512   3145728
projection map                          3           float    2 × 81           1944
Total                                                                         11094276

Table 5.3: Sizes of all textures needed for the mixed relief/BTF renderer

5.2 Level of detail

The overall visual quality of the LoD method introduced in this thesis has been examined for various test scenes. Figure 5.1 shows five interesting surfaces with different properties that influence the overall quality:

Stones This surface shows some cobblestones with quite large displacements. The stones are colored red, the gaps between the stones green, which means that for grazing view or light angles most of the green will be occluded or in shadow, whereas for view and light directions closer to normal incidence all the green gaps will be visible and illuminated, resulting in significant color differences.

British Isles This surface shows the topography of the British Isles. It features specular, flat water and diffuse land. While the water reflects light mostly in mirror direction, the land exhibits more back-reflection because of its roughness.

Objects This surface shows some objects protruding from a flat plane with nearly no difference in reflection properties across the surface, except for a slight wood texture. However, it is easy to recognize the objects by the effects of occlusion and shadowing, which yield hard silhouettes and shadows.

Text This surface shows some text displaced from a flat plane. While the text itself has a different color, the rest of the surface is totally plain.

Anisotropic The geometry of this surface consists of parallel, long V-shaped trenches (also called V-cavities). The overall surface exhibits highly anisotropic reflection properties similar to brushed metal, even though the reflection of each point itself is isotropic.

Figure 5.1 shows these surfaces rendered for different view and light directions both by the relief renderer and the BTF renderer. One can see that the images rendered using the relief renderer are subject to aliasing, especially the anisotropic surface with its small V-cavities, where distinct Moiré patterns can be seen. The images rendered using the BTF renderer do not show this problem, since they use the prefiltered data. It can also be seen that the compression reduces many of the highlights, especially for the Stones surface. However, overall the appearance is quite similar to that of the relief renderings. For example, the effects of occlusion and shadowing are preserved, as can be seen for the Objects surface: the shadows are clearly visible in the same direction as in the images produced by the relief renderer, even though the fine details are blurred. For the Stones surface, the colors are occluded correctly. The reflectance of the anisotropic surface is preserved perfectly. For the BTF renderer, this is actually an easy case, since the downsampled images of the Laplacian pyramid contain nearly no variation: the whole surface looks uniform for a given view and light direction when filtered to small resolutions.

(a) Stones (b) British Isles (c) Objects (d) Text (e) Anisotropic, each shown as relief rendering and BTF rendering

Figure 5.1: Different surfaces used for the evaluation, all rendered for different view and light directions both by the relief renderer and the BTF renderer

Figure 5.2: Transition from the relief renderer to the BTF renderer for various LoD levels when moving away from the surface

The smooth transition from the relief renderer to the filtered BTF renderer when moving further and further away from the surface is shown in Figure 5.2. This transition blends between the relief renderer and the BTF in the first few images; then the image is rendered solely using the filtered BTF at increasing LoD levels, i.e. with a decreasing number of levels of the filter pyramid. The projection used to transform the texture coordinates makes sure that the BTF renderer maps the BTF correctly onto the surface volume, and the result is consistent with the images rendered using displacement mapping.

5.3 BTF compression

The compression method has a large influence on the quality of the images produced from the BTF renderer. While we mostly follow the compression method suggested by Ma et al. in [MCT+05], there are many parameters that can be tweaked—especially since our requirements are slightly different, having artificially rendered surfaces with possibly large displacement values compared to the mostly flat, measured surfaces in their work. In this section, we investigate the influence of these parameters on the rendering quality.

Number of components
Each level of the filter pyramids is compressed using PCA, where the most dominating principal components are stored in the eigen map. Together with the weights computed from these principal components, and the mean vector µk of the last level of the filter pyramid, these components are then used to reconstruct the reflectance of a surface point. The best choice for the number of principal components for each component Y/Cb/Cr of the color space is not obvious: the more principal components are used for reconstruction, the better the result will look. However, more components also mean larger texture sizes and more expensive reconstruction. In Table 5.4 we measured the mean squared error of the reconstructed images for all view and light directions compared to the original, uncompressed images, for all five surface types and different numbers of components used for reconstruction.
The reconstruction error in the Y channel is always higher than the one for the Cb and Cr channels, even though more components are used for Y than for the other channels in all cases. The reason for this is that the reflection models primarily change the brightness of a pixel, but the chromaticity changes only to a lesser extent.

Surface          Channel   4/1/1     8/2/2     12/3/3    16/4/4    20/8/8
Stones           Y         0.00303   0.00279   0.00260   0.00246   0.00234
                 Cb        0.00043   0.00040   0.00038   0.00037   0.00033
                 Cr        0.00190   0.00143   0.00123   0.00113   0.00089
British Isles    Y         0.00027   0.00018   0.00014   0.00011   0.00009
                 Cb        0.00014   0.00010   0.00008   0.00006   0.00003
                 Cr        0.00004   0.00003   0.00003   0.00002   0.00001
Objects          Y         0.00176   0.00134   0.00110   0.00096   0.00086
                 Cb        0.00010   0.00008   0.00006   0.00006   0.00004
                 Cr        0.00006   0.00004   0.00004   0.00003   0.00002
Text             Y         0.00223   0.00164   0.00134   0.00117   0.00105
                 Cb        0.00000   0.00000   0.00000   0.00000   0.00000
                 Cr        0.00000   0.00000   0.00000   0.00000   0.00000
Anisotropic      Y         0.00017   0.00010   0.00006   0.00004   0.00003
                 Cb        0.00000   0.00000   0.00000   0.00000   0.00000
                 Cr        0.00000   0.00000   0.00000   0.00000   0.00000

Table 5.4: Mean squared error of the reconstructed Y, Cb and Cr channels for various surfaces encoded with different numbers of principal components per channel

Together with the fact that the human eye is more sensitive to brightness than to color, this means that it is indeed a good idea to use fewer components for the encoding of the Cb and Cr channels than for the Y channel. The surfaces Text and Anisotropic are a special case, where no reconstruction error occurs in the Cb and Cr channels, since both are grayscale surfaces. In these cases, we would not need to store any components for the Cb and Cr channels at all. Also, the reconstruction quality varies greatly between the different surface types: While the Stones, Objects and Text surfaces yield high reconstruction errors, the British Isles and Anisotropic surfaces are much easier to compress. This might be because British Isles and Anisotropic have quite small displacements and their reflectance varies more smoothly compared to the other surfaces (the fine details of the Anisotropic surface are not visible at the resolutions that the BTF can encode). Overall, we found that 16/4/4 components give a reasonable compromise between performance and quality, though for the British Isles and Anisotropic surfaces one might want to use fewer components. Therefore, the BTF renderer supports both 16/4/4 and 8/2/2 components for Y/Cb/Cr, and one can choose which one is a better fit for a given surface.

5.3.1 Border pixels

As described in Section 3.4, we need to pay attention to the way empty border pixels are handled, i.e. pixels for which no intersection with the depth map was found during relief mapping. While these border pixels will not be rendered, since relief mapping discards these pixels also for the BTF renderer, they will influence the compression. We measured the mean squared reconstruction error for the Stones surface compressed with the different border handling approaches, this time measuring the difference between the reconstructed image converted to RGB and the original image; the resulting errors for the reconstruction with both 16/4/4 and 8/2/2 components are shown in Figure 5.3.

[Bar chart: MSE of the reconstructed images for the border handling methods Black, 2PCA, 3PCA, Average and NoBorder, each for 8/2/2 and 16/4/4 components]

Figure 5.3: Mean squared error (MSE) of the reconstructed images for the Stones surface compressed using different methods to handle empty border pixels

The tested border handling approaches are the following:

NoBorder This method completely ignores the border: any empty pixel is treated as a black pixel. Filtering and compression therefore do not have to handle the case of empty border pixels.

Black Filtering takes empty border pixels into account, and the box filter only averages those pixels that are not empty. If all pixels to be averaged are empty, the result is an empty pixel as well. After filtering, empty pixels in all Laplacian images and the last Gaussian image are set to Y=0, Cb=0, Cr=0.

Average This works like the method Black, but instead of setting empty border pixels to black in each image, it sets these pixels to the average color of all non-empty pixels in the image. Since the Laplacian images are difference images, this average should also be Y=0, Cb=0, Cr=0. However, the Gaussian images at the last filter pyramid level may have non-zero averages; this is where this approach differs from the method Black.

2PCA & 3PCA Applies PCA iteratively to the result of method Average as described in Section 3.4. In total, two respectively three PCA passes are done for 2PCA and 3PCA.

As expected, the first approach, simply ignoring the border pixels, results in considerably higher errors. However, the other approaches give basically the same quality: Multiple PCA passes or setting the empty pixels to the average of each image of the filter pyramid does not seem to improve the compression compared to setting these pixels to black. Because of the high run time of the PCA algorithm, multiple PCA passes are therefore actually quite inefficient.

5.3.2 Mean encoding

Another parameter that influences the quality of the reconstructed images is the encoding of the mean vector µk of the last level of the filter pyramids. The mean is an ABRDF and can either be stored directly as a vector, or fitted to a reflection model as proposed in [MCT+05]. Figure 5.4 shows the mean vector and the BRDF resulting from fitting a Blinn-Phong model to the mean as described in Section 3.3.2, for three different surface types. The figure shows that the mean vector and the fitted model can differ significantly. For example, the Stones surface with its large displacements exhibits significant back-reflection, but very little reflection in mirror direction, since the depth map has no flat regions. However, the Blinn-Phong model can encode only

Mean vector µk (top) and fitted model (bottom) for: (a) Stones (b) British Isles (c) Anisotropic

Figure 5.4: The mean vector µk and the fitted Blinn-Phong model for different surfaces

specular highlights in mirror direction, and consequently fails for this case.¹ Additionally, the encoding of the colors is very poor: the greenish color for view and light directions at normal incidence and the red appearance for grazing angles are totally lost. The total mean squared error for this surface when reconstructing the images with 16/4/4 principal components therefore increases from 0.003297 when using the mean vector directly to 0.004422 when using the fitted model. For the anisotropic surface, the situation is very similar: the Blinn-Phong model is not able to represent anisotropic materials, hence using the fitted model results in large errors. However, the fitting works quite well for the British Isles scene, since most of the surface consists of flat, specular ocean. But even in this case, the fitting is not perfect: for example, the hills exhibit slight back-reflection, which is lost, and the highlights are slightly blurred.
In general, especially for very rough surfaces, approximating the mean by fitting a Blinn-Phong model seems to be a bad idea. While it takes less memory to store and can be evaluated for any view and light direction directly, it introduces too large errors in most cases. This problem might be solved by using different reflection models, which are able to handle explicit back-reflection and possibly anisotropy. However, the direct use of the mean vector seems to be preferable.

5.4 Interpolation

Interpolation of the BTF is important to reduce aliasing and other effects such as sudden changes of surface reflectance when a direction index changes. The weight map can be interpolated directly by the GPU when a linear texture filter for the weight texture is enabled. However, the view and light directions, encoded in the eigen map, cannot be interpolated that way, since the direction indices are not linear, but actually correspond to two-dimensional directions. Instead, the directions can be interpolated by evaluating the mean and possibly the filter pyramid levels for the three nearest view and light directions and weighting the results correspondingly, as described in Section 4.1.4. This results in a total of nine different direction index pairs, for which the BTF has to be decompressed. In case the mean vector was fitted using a Blinn-Phong model, this interpolation is unnecessary, since the model can be evaluated directly for the exact view and light directions.

¹ Interestingly, the fitted model tries to imitate the back-reflection by using negative values for the specular color ks; however, this works only partially.

(a) Stones: reference image, full interpolation, interpolation of mean only. (b) British Isles: reference image, full interpolation, interpolation of mean only, evaluation of the mean model.

Figure 5.5: Interpolated rendering of the Stones and British Isles surfaces for different view and light directions, showing the reference image rendered with the relief renderer, and interpolated BTF images using different interpolation strategies

Figure 5.5 shows the results of interpolated rendering for the Stones and British Isles surfaces for different view and light directions. The first row shows the reference image produced by the relief renderer, while the second and third row show the results from the BTF renderer using full interpolation of the mean and the filter pyramid levels or interpolating only the mean vector. For the British Isles, the fourth row shows the result for a fitted mean model; since the model fitting for the Stones scene introduces too large errors, we do not show the results for a fitted mean model in this case. At first sight all interpolated results look quite similar; however, a few differences can be noted. At points where the nearest view or light direction changes, one can see artifacts for the methods where only the mean is interpolated. Figure 5.6 shows a close-up of one of these cases, where the artifacts are clearly visible. While these artifacts may not stand out that much in these still images, when moving around the surface they are much more apparent. Of course, the full interpolation of all filter levels removes this kind of artifact, resulting in a surface that looks smooth everywhere.

(a) Full interpolation of the mean and all filter levels (b) Interpolation of the mean only

Figure 5.6: Difference of full interpolation and the interpolation of the mean only for the Stones surface

For surfaces where the mean vector can be fitted to a model without resulting in too large errors, the mean model can be used as well - and achieves similar results. For the British Isles scene, the results of interpolating the mean vector and evaluating the fitted model are barely distinguishable.

While yielding slightly better quality, the full interpolation of the mean vector µk and all filter pyramid levels results in a huge performance loss, since the whole BTF must be decompressed nine times for different view and light direction indices instead of only once (see Table 5.2). Using a fitted mean model instead of interpolating the mean vector increases the performance slightly.

5.5 Curved surfaces

We evaluated the normal-based curved silhouette rendering algorithm using a torus as base surface. The torus has many interesting properties, like different curvatures for its surface points. The displacement mapped surface looks quite good overall, although a slight distortion occurs near its silhouette. This is probably due to two reasons: Firstly, the tangent space is only evaluated at discrete points, although it varies smoothly across the surface. Since the step size must be small to not miss any features, this error is quite small as well, but propagates with each step. The second reason for the distortion is that the scaling factors are only valid for the surface points on the base mesh. Once the ray has been traced deeper into the surface, the scaling factors would need to be adapted as well, but this is not done in normal-based curved silhouette rendering. Again, the error due to this adds up with each step. Figure 5.7 shows the combination of the displacement renderer for curved surfaces with the BTF renderer. The renderer will smoothly transition from explicit evaluation of the reflection models to the filtered BTF data, but will always compute the correct silhouette. The figure clearly shows that this transition is not consistent, and that the BTF renderer produces considerably different results. This has several reasons. The BTF was created by the relief mapper by tracing rays into a flat surface volume. The curved surface renderer, however, may find different intersection points with the depth map for the same texture coordinate and view direction, especially for large curvatures or grazing angles. This discrepancy between the intersection points found during BTF construction and the intersection points found while tracing the curved surface results in noticeable errors. Another reason is that the tangent space of the original point on the base surface and the tangent space of the intersection point with the depth map may be quite different. One of those has to be used to compute the view and light direction indices for the BTF look-up, but both alternatives are not optimal and introduce additional problems.

Figure 5.7: Transition from normal-based curved silhouette rendering to the BTF renderer for a torus when moving away from the object

If the tangent space of the point on the base surface is used, it will look as if the point is illuminated from the wrong direction, resulting in strange illumination effects that are quite apparent in Figure 5.7. However, using the tangent space of the exact intersection point to transform the view and light direction is not good either, since the BTF is actually looked up at the texture coordinate of the point on the base surface and not at the exact intersection point, amplifying the problem with the discrepancy of the intersection points for the curved surface and the flat surface. Additionally, using the tangent space of intersection points close to the silhouette may result in view directions (θr,φr) with θr > π/2, for which the BTF is not defined. In conclusion, filtered rendering using BTFs should not be directly applied to curved surfaces, as this will give incorrect results. To adapt this approach, the BTF would need to store the images not only for each view and light direction, but also for different curvatures. However, this would add even more dimensions to the BTF, which introduces additional difficulties for nearly all steps of preprocessing and rendering.

6 Conclusion

The goal of this thesis was to design a method that can filter and render rough surfaces with arbitrary displacements and varying reflection properties in real-time and with the best possible quality. So far, we do not know of any other method that can filter such surfaces correctly. By generating a BTF, which contains the surface rendered for many different view and light directions, a correct filtering technique for these arbitrary surfaces becomes possible.

For this thesis, we designed and implemented a preprocessing pipeline that generates and filters such a BTF from only a few textures, such as a depth map and color textures describing the reflection properties. The preprocessing uses per-pixel displacement mapping and several reflection models to render the surface for each view and light direction; the rendered images are rectified and then filtered using Laplacian pyramids, which segment the images into different frequency bands and can be used for a level-of-detail reconstruction of the surface.

However, the raw BTF is too large to be used effectively in practice. Therefore, the design and implementation of a reasonable compression method was another important task in this thesis. The compression method is based on principal component analysis and has already been discussed by other authors; however, we found many possibilities to improve the quality of the reconstructed result.

By combining the BTF renderer with the relief renderer, the final rendering algorithm renders surfaces in high quality over a large range of resolutions: for small resolutions, where filtering is needed to minimize aliasing, the BTF renderer is used, whereas for large resolutions the relief renderer achieves higher quality.

All algorithms for filtering and rendering were implemented in a prototype. Using this implementation, we tested how the various optimizations and parameters affect the quality and performance of the renderer in practice. Overall, the filtering approach for rough surfaces using BTFs shows promising results, though the quality of the BTF renderer depends considerably on the type of surface. However, we found that the BTF renderer cannot be used for curved base surfaces, as the results are inconsistent with the ones produced by evaluating the reflection models directly for each intersection point found by the per-pixel displacement mapper.
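As a rough illustration of the frequency-band filtering summarized above, the following minimal sketch builds and reconstructs a Laplacian pyramid for a single-channel image. It is a stand-in written for this summary, not the thesis implementation: the actual preprocessing operates on the rectified BTF images, and the blur kernel, downsampling scheme and boundary handling used there may differ.

import numpy as np
from scipy import ndimage

def build_laplacian_pyramid(img, levels):
    # Split a single-channel image into frequency bands: each level stores the
    # detail that is lost when going to the next coarser (blurred, downsampled)
    # level; the last entry is the low-pass residual.
    pyramid = []
    current = img.astype(np.float64)
    for _ in range(levels):
        blurred = ndimage.gaussian_filter(current, sigma=1.0)
        coarse = blurred[::2, ::2]
        upsampled = np.repeat(np.repeat(coarse, 2, axis=0), 2, axis=1)
        upsampled = upsampled[:current.shape[0], :current.shape[1]]
        pyramid.append(current - upsampled)   # band-pass detail of this level
        current = coarse
    pyramid.append(current)                   # low-pass residual
    return pyramid

def reconstruct(pyramid, lod):
    # Rebuild the image up to a chosen level of detail: start from the residual
    # and add back detail bands, leaving out the 'lod' finest levels.
    img = pyramid[-1]
    for detail in reversed(pyramid[int(lod):-1]):
        up = np.repeat(np.repeat(img, 2, axis=0), 2, axis=1)
        img = up[:detail.shape[0], :detail.shape[1]] + detail
    return img

# Usage with a hypothetical 64x64 image: drop the two finest frequency bands.
img = np.random.rand(64, 64)
pyr = build_laplacian_pyramid(img, levels=4)
coarse_view = reconstruct(pyr, lod=2)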

Future work Since many real-world objects use curved base surfaces, it would be particularly useful to modify the BTF-based filtering approach to handle curved surfaces as well. As explained before, one way to do this might be to extend the BTF by additional dimensions for the curvature. However, this would likely make both the construction of the BTF and its compression more complicated. Another useful feature would be the use of sky maps to illuminate the surface not only with a single light source but with the whole environment. For this, it might be possible to convolve the BTF data with the sky map in such a way that the BTF renderer can decompress the reflected light for any surface point from the convolved data using the view direction and the normal vector of the base surface instead of the view and light directions as in our method. Also, it would be interesting to test how other BTF compression methods behave on the synthetic BTFs constructed in this thesis, especially since the PCA method often attenuates sharp highlights; other compression methods might be better suited for this kind of synthetic BTF. Finally, further testing of the BTF-based filtering method on large scenes with different rough surfaces is needed to verify that this approach indeed works universally.


List of Figures

2.1 Per-vertex displacement mapping tessellating the surface and displacing the vertices ...... 8 2.2 Per-pixel displacement mapping tracing the view ray to find the intersection with the height map ...... 8 2.3 Per-pixel displacement mapping using the cone stepping distance function ...... 9 2.4 Per-pixel displacement mapping for curved surfaces using curved rays ...... 10 2.5 A BRDF, showing the reflected radiance of a surface with normal n for light coming from direction ωi ...... 11 2.6 Comparison of diffuse reflection models for different angles of incident light ...... 12 2.7 Comparison of specular reflection models on a diffuse sphere for different angles of incident light ...... 13 2.8 Two different representations of a BTF ...... 17

3.1 Steps of the preprocessing pipeline ...... 19 3.2 View and light directions used in the BTF representation ...... 20 3.3 Per-pixel displacement mapping using relaxed cone steps ...... 22 3.4 Artifacts of the original algorithm for computing relaxed cone step maps ...... 23 3.5 Transformation that rectifies the rendered images ...... 24 3.6 Transformation by the orthogonal projection for view direction (θr,φr) ...... 25 3.7 Derivation of the projection matrix for BTF rendering (1) ...... 25 3.8 Derivation of the projection matrix for BTF rendering (2) ...... 25 3.9 Layout for the textures containing the eigen ABRDFs ekj and the weights dkj ...... 28

4.1 Texture to look up the nearest direction index for direction ω = (θ,φ) ...... 31 4.2 Rendered images for different LoD parameters ...... 33 4.3 Index map, encoding the nearest three directions and their weights ...... 34 4.4 Normal and tangent map for a torus ...... 35

5.1 Different surfaces used for the evaluation, rendered both by the relief renderer and the BTF renderer ...... 42 5.2 Transition from the relief renderer to the BTF renderer for various LoD levels when moving away from the surface ...... 43 5.3 Mean squared error of the reconstructed images for the Stones surface compressed using different methods to handle empty border pixels ...... 45 5.4 The mean vector µk and the fitted Blinn-Phong model for different surfaces ...... 46 5.5 Interpolated rendering of the Stones and British Isles surfaces using different interpolation strategies ...... 47 5.6 Difference of full interpolation and the interpolation of the mean only for the Stones surface ...... 48 5.7 Transition from normal-based curved silhouette rendering to the BTF renderer for a torus when moving away from the object ...... 49

List of Tables

5.1 Run times of the preprocessing steps ...... 39 5.2 Performance in frames per second for rendering surfaces compressed with 16/4/4 and 8/2/2 components for different types of interpolation ...... 40 5.3 Sizes of all textures needed for the mixed relief/BTF renderer ...... 41 5.4 Mean squared error of the reconstructed Y, Cb and Cr channel for various surfaces encoded with different numbers of principal components per channel ...... 44

Acknowledgments

I would like to thank Prof. Gumhold and Andreas Stahl for their comprehensive support and supervision during the development of this thesis and many enlightening discussions. I would also like to thank my family and Pauline Churavy for their patience and encouragement and Jonas Ecke and Stefan Polenz for their editorial help.