2007:12 HIP BACHELOR'S THESIS

Implicit Procedural Textures as a means of saving texture memory

Joakim Lindqvist

Luleå University of Technology BSc Programmes in Engineering BSc programme in Computer Engineering Department of Skellefteå Campus Division of Leisure and Entertainment

2007:12 HIP - ISSN: 1404-5494 - ISRN: LTU-HIP-EX--07/12--SE Implicit Procedural Textures as a means of saving texture memory Page | I Joakim Lindqvist LTU Skellefteå 6/4/07

Abstract This thesis explores the possibility of replacing regular textures with procedural textures. It focuses on classic procedural textures such as different types of rock and sand. Most of the time was spent on implicit texture generation on a GPU, which limits the types of algorithms that can be used. As a result, much of the work focuses on noise implementations. The goal is to lower video memory usage by replacing regular sampled textures with procedural ones. In the end, a few ways of creating textures with a satisfactory level of detail are suggested, using a blend of regular and procedural textures.

Sammanfattning (Summary) This thesis investigates the possibility of replacing regular textures with procedural textures. It focuses on classic procedural textures such as rock and sand. I work mostly with implicit textures evaluated on the GPU, which limits the algorithms that can be run. Because of this, much of the focus is on different noise implementations. The goal is to reduce the texture memory used by replacing regular textures with procedural counterparts. In the end I present a few ways of creating textures with sufficient detail using a blend of regular and procedural textures.

Preface This thesis was written at Avalanche Studios with the purpose of investigating how procedural textures could be used to save memory, bandwidth and sampler usage in their graphics engine by replacing regular high resolution textures with procedural counterparts. The main focus was the textures used in the terrain engine, since they are classic examples of textures that are easily created procedurally. I would like to thank everyone at Avalanche for being so nice and making me feel welcome, especially my mentor Gustav Taxén.

Joakim Lindqvist ([email protected]), Stockholm, 4. Jun. 2007

Table of Contents

Abstract
Sammanfattning
Preface
Introduction
  Background
  Thesis Outline
Theory
  Procedural texturing
  Tools
  Shaders
  Shading
  Pattern
  Noise
    Noise Terminology
    Value and Gradient Noise
    Perlin Noise
    Improved Perlin Noise
    Simplex Noise
    mNoise
    Fractal Sums
    Aliasing
Implementation
  Implicit and Explicit texturing
  The noises
  The texture
  Compilers
  Permutation table
  Decals
  Fractal Sums
  Creating a copy of a real texture
  Aliasing
  Blending
  Explicit Normal Map
Results
Discussion
  Conclusion
    Shaders
    Tools
    Higher dimension noise
    Drawbacks
    Other advantages of implicit procedural textures
    How to create the 2D textures we want
  Further work
References

Illustration Index

Illustration 1: Perlin Noise
Illustration 2: FX Composer
Illustration 3: Smoothstep Function [1]
Illustration 4: Procedural Brick Texture [1]
Illustration 5: Improved Perlin Noise
Illustration 6: Simplex Noise
Illustration 7: M = 1
Illustration 8: M = 61
Illustration 9: M = 151
Illustration 10: 1/f fBm
Illustration 11: Turbulence
Illustration 12: The Material Atlas Texture
Illustration 13: Procedural Texture with decals
Illustration 14: Procedural texture without filtering
Illustration 15: Procedural texture with filtering
Illustration 16: Original Screenshot
Illustration 17: Procedural texture with sampled low resolution height map
Illustration 18: Procedural texture with sampled high resolution height map

Introduction

Background Graphics hardware has grown considerably in performance over the last couple of years, giving programmers more computational power but also more video memory, more sampler units and higher bandwidth. The trend now seems to be moving toward increases in processing power rather than in video memory and bandwidth. To fully utilize this, more procedural methods need to be used, and judging from the game development conferences of the last year, procedural techniques seem to be what everyone is doing. One way of better utilizing the limited texture capacity is to calculate the textures directly on the GPU (procedural texturing), a technique that has been used in ray tracers (through RenderMan shaders) for quite some time [1].

My research questions during this thesis work were:
● Can procedural textures be used to replace regular textures as a means of saving video memory and samplers at interactive speeds?
● What are the drawbacks of procedural textures, if any?

Thesis Outline I start with some theory behind shaders, procedural textures and the noise algorithms I used. That is followed by information on my implementations of the textures. Because shader and procedural texture code is by nature very specific to each case, I have chosen to write less about the code itself and more about the techniques I used. Most of the code was created using a simple trial and error approach, since what looks good in a texture is hard to calculate theoretically. After this I present a few screenshots of the different shaders I have written during this thesis. Lastly, I describe some ways of doing procedural textures that I feel give a good result.

Theory

Procedural texturing There are two types of procedural textures [1]: implicit and explicit. Explicit textures are created using a set of procedural instructions but are then saved in memory (or in a file). The advantage of using explicit procedural textures instead of regular textures is the smaller size of the texture for distribution and storage of the application. In general, a set of instructions on how to create a texture takes less storage space than the final image. Explicit textures can also be recalculated (given a sufficiently fast implementation) to reflect changes in the simulation. For instance, as suggested by Lefebvre and Neyret [2], you could add decals onto the texture instead of creating a new 3D surface for each decal. The biggest advantages of procedural texturing, however, come with implicit textures. An implicit texture is never saved to memory; instead, the procedural instructions for the texture are executed every time it needs to be evaluated. In today's games and similar applications that means it is executed as shader code on the GPU. The obvious advantages of implicit procedural textures are lower video memory usage and savings in bandwidth and texture samplers. Less obvious, but even more valuable, are resolution independence and size independence, which make procedural textures a great fit for very large meshes viewed from both near and far.

Illustration 1: Perlin Noise

Procedural textures are built from two major building blocks, shading and pattern [1]. All textures are just a combination of shading (color) and pattern. The most common patterns are noise and cellular patterns, together with simple mathematical patterns (for instance a brick texture created using a step function). Noise is a pattern built to mimic white noise (it was first described by Ken Perlin in his paper An Image Synthesizer [3] from 1985), but it is pseudo-random (i.e. deterministically random: given a set of inputs you always get the same "random" output) instead of having the complete randomness of white noise. This is because we want it to always be the same given the same inputs, something that applies to all patterns used in procedural textures. Cellular textures are built from a scalar function based on the distribution of randomly placed sample points in space. Cellular textures in general give a more sponge-like, pebbly or scaly pattern than noise.

There are many different algorithms for texture patterns; the two most common families are cellular and noise. I do not use cellular patterns, partly because the focus is on terrain textures, which are generally not cellular in appearance, and partly because cellular patterns are computationally heavy on a GPU.

Tools

FX Composer

There are no tools made exclusively for procedural texturing that focus on real-time textures, but since a procedural texture is just a shader, FX Composer from NVIDIA is a fairly good tool. It supports HLSL code and FX files and can change shader parameters in real time. It has a library of simple meshes you can use and a .X model loader. A very good feature is the ability to call C functions to create explicit textures, even though these textures have a few limits in their use. The explicit textures can be saved to a file, a feature I used a bit, most notably for creating different permutation tables.

Illustration 2 : FX Composer

ShaderPerf

ShaderPerf is an application from NVIDIA that gives theoretical values of shader performance on different hardware, together with the total number of each type of shader operation. It does not show when branching causes overhead, or other flow information, but it is the closest thing there is to a profiler. It is also available as a built-in part of FX Composer.

PIX

PIX stands for Performance Investigator for Xbox (now simply PIX, since it is available for Windows as well). With it you can capture every call made to DirectX and view every resource allocated through it. It is a great tool for optimizing the performance of DirectX applications.

Shaders I make heavy use of dynamic branching in my shader; I explain why in the implementation chapter, where I describe the texture I was replacing. First, a few things to note about dynamic branches. Recent shader profiles support dynamic branches, meaning a branch in the code that can be evaluated differently for each pixel. This gives great opportunities for optimization. Shadow mapping is the classic example, but dynamic branches are also a good way to create an optimized atlas of procedural textures. A GPU processes a block of 4x4 pixels at the same time to keep all shader processors busy [13]. Because the pixels in a block execute in parallel, they must all take the same amount of time, while dynamic branches by nature take different amounts of time. As a result, if two pixels in a block take different paths, all of those paths have to be evaluated for every pixel in the block, so great care is needed when using dynamic branches. If every pixel in the 4x4 block takes the same path, only that one branch is evaluated. Note that the 4x4 size applies to the ATI X1800 architecture; other cards may use other thread sizes, but the general idea of minimizing divergent branch paths within a single thread still applies.
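The cost of divergence described above can be illustrated with a small cost model. This is only a sketch under stated assumptions: the function name `block_cost`, the material names and the instruction counts are hypothetical, and real hardware scheduling is more involved than a simple sum.

```python
def block_cost(branch_ids, branch_costs):
    """Cost charged to every pixel of one lockstep block.

    branch_ids:   the branch taken by each pixel in the block
                  (16 entries for a 4x4 block)
    branch_costs: hypothetical instruction count of each branch
    """
    # Every distinct branch taken somewhere in the block is executed
    # by the whole block, so the costs of the taken branches add up.
    taken = set(branch_ids)
    return sum(branch_costs[b] for b in taken)


# Hypothetical per-material instruction counts.
costs = {"rock": 120, "sand": 80, "grass": 100}

coherent = ["rock"] * 16             # all 16 pixels agree
divergent = ["rock"] * 15 + ["sand"]  # a single pixel disagrees

print(block_cost(coherent, costs))   # 120
print(block_cost(divergent, costs))  # 200
```

In this model a single pixel taking another path makes the whole block pay for both branches, which is why atlas layouts that keep neighbouring pixels in the same material are preferable.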

Shading If pattern is what makes a texture look lifelike, shading is what makes it represent anything at all [1]. Shading is the color of the texture. It is most often calculated using a diffuse reflection model (the Lambert reflection model), but it can also be created using a mathematical function or in whatever other way you can think of to color an area. The shading of a texture is usually built together with the pattern, since in many cases different heights in a texture (indicating different areas) are shaded differently.

Pattern Patterns are extremely important in a texture [1]. They make the difference between something that looks lifelike and something that has a cartoon-like feeling. Patterns are usually modeled after the surface we are trying to represent (when building textures that represent real-life surfaces). Most surfaces can be described, at least to some simple degree, by a mathematical formula of some kind; a simple example is a smoothstep function describing the height of a brick texture.

Illustration 3 : Smoothstep Function [1]

Illustration 4: Procedural Brick Texture [1]

Patterns are usually present in regular textures as well, most often in the form of height maps, and are used in many algorithms, for instance when calculating the normal of a surface (normal mapping).
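The brick example can be sketched in a few lines. The code below is a minimal illustration, not the shader from [1]: `smoothstep` follows the usual definition (clamp, then $3t^2 - 2t^3$), while `brick_height` and its parameters (`brick_w`, `brick_h`, `mortar`) are hypothetical names and values chosen for this example.

```python
def smoothstep(edge0, edge1, x):
    """Smooth 0..1 ramp between edge0 and edge1 (usual shader definition)."""
    t = min(max((x - edge0) / (edge1 - edge0), 0.0), 1.0)
    return t * t * (3.0 - 2.0 * t)


def brick_height(u, v, brick_w=0.25, brick_h=0.125, mortar=0.02):
    """Height in [0, 1]: 0 in the mortar lines, 1 on top of a brick."""
    row = int(v // brick_h)
    if row % 2 == 1:
        u += brick_w * 0.5  # offset every other row by half a brick
    fu = u % brick_w        # position inside the current brick
    fv = v % brick_h
    # Distance to the nearest brick edge along each axis.
    du = min(fu, brick_w - fu)
    dv = min(fv, brick_h - fv)
    # Ramp up smoothly from the mortar to the flat brick top.
    return smoothstep(0.0, mortar, du) * smoothstep(0.0, mortar, dv)


print(brick_height(0.125, 0.0625))  # centre of a brick: 1.0
print(brick_height(0.0, 0.0625))    # on a vertical mortar line: 0.0
```

The same height function can drive both the shading (mortar versus brick color) and a normal map, which is the link between pattern and shading made in the sections above.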

Noise

Noise Terminology

Lattice

A grid of regularly spaced values, i.e. the distance between neighboring cells is always the same. Think of the integer values as lattice points, with fractional (float) values in between them.

Gradient

A vector representing the slope of an N-dimensional function.

Permutation Table

A table containing the n values 0 to n-1, each exactly once, in random order.

Simplex

A simplex is the simplest shape that can be used to tile an N-dimensional space; it has as few corners as possible. The simplex in 2D is a triangle; in 3D it is a tetrahedron. An N-dimensional simplex has N+1 corners.

Value and Gradient Noise There are two major types of lattice noise, value and gradient [9]. Since both are lattice noises, they use a permutation table of a suitable size (256 or larger). The difference lies in how they read and use the values of the permutation table. Value noise accesses the permutation table once for each lattice point and then uses cubic interpolation to calculate a noise value. This means a lot of lookups. The other approach, used by gradient noise, is to use the permutation table to generate random gradients and then interpolate the gradients to determine the noise value. I focused entirely on gradient noise because it gives much better-looking noise. Ken Perlin has written a lot about noise in different situations [3][5][6], and Marc Olano has presented an interesting implementation in his mNoise [4].

Perlin Noise The original noise implementation was developed by Ken Perlin [3] and is therefore named Perlin Noise. It is a gradient noise that picks evenly distributed gradients around a unit cube. It picks gradients by hashing the input value into a scalar that is used to index into a permutation table of suitable size (256 in most cases). The gradient contributions are then interpolated using a cubic function whose derivative is 0 at both 0 and 1, i.e. $3t^2 - 2t^3$. That derivative is required because any other value would produce sharp edges at the lattice points and thus artifacts. The big drawback of Perlin Noise appears in higher dimensions: its cost grows as $O(2^n)$ with the dimension $n$. See Illustration 1 for an image of 2D noise.
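As a rough illustration of the steps described above (hash the lattice point through a permutation table, pick a gradient, interpolate with the cubic), here is a minimal 2D gradient noise sketch. It is not Perlin's reference code: the gradient set (256 unit vectors spread around a circle), the seed and all names are assumptions made for this example.

```python
import math
import random

# Assumed setup: a shuffled 256-entry permutation table and 256 unit
# gradients evenly spread around the circle.
_rng = random.Random(1985)
PERM = list(range(256))
_rng.shuffle(PERM)
GRADS = [(math.cos(2.0 * math.pi * i / 256), math.sin(2.0 * math.pi * i / 256))
         for i in range(256)]


def fade(t):
    """Cubic interpolant 3t^2 - 2t^3 with zero slope at t = 0 and t = 1."""
    return t * t * (3.0 - 2.0 * t)


def lerp(a, b, t):
    return a + t * (b - a)


def gradient(ix, iy):
    """Hash an integer lattice point into one of the gradients."""
    return GRADS[PERM[(PERM[ix & 255] + iy) & 255]]


def perlin2d(x, y):
    x0, y0 = math.floor(x), math.floor(y)
    fx, fy = x - x0, y - y0

    def corner(dx, dy):
        # Dot product of the corner gradient and the offset from that corner.
        gx, gy = gradient(x0 + dx, y0 + dy)
        return gx * (fx - dx) + gy * (fy - dy)

    u, v = fade(fx), fade(fy)
    top = lerp(corner(0, 0), corner(1, 0), u)
    bottom = lerp(corner(0, 1), corner(1, 1), u)
    return lerp(top, bottom, v)


print(perlin2d(3.0, 7.0))  # gradient noise is zero on lattice points
print(perlin2d(1.3, 2.7))
```

The four `corner` calls are the $2^n$ lattice corners for $n = 2$; in 3D the same scheme needs eight, which is the growth the section refers to.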

Improved Perlin Noise Perlin himself developed an improvement to regular Perlin noise that addresses a few of its weaknesses [5]. It is called Improved Perlin Noise. First, instead of generating pseudo-random gradient directions, it picks gradients from a small fixed set of 12 directions pointing from the center of a cube to the midpoints of its edges. Second, Perlin recognized an issue with the interpolation function: its second derivative is not 0 at both 0 and 1, as the first derivative is. It was therefore changed to $6t^5 - 15t^4 + 10t^3$. In higher dimensions this noise increases in complexity in the same way as regular Perlin noise does. Improved Perlin noise can be viewed as just a few bug fixes to the regular Perlin implementation.

Illustration 5 : Improved Perlin Noise
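The difference between the two interpolation polynomials, the cubic $3t^2 - 2t^3$ and the quintic $6t^5 - 15t^4 + 10t^3$, can be checked directly from their second derivatives. The function names below are chosen for this example only; the derivatives were worked out by hand and are written out explicitly.

```python
def fade_cubic(t):
    return 3 * t**2 - 2 * t**3


def fade_quintic(t):
    return 6 * t**5 - 15 * t**4 + 10 * t**3


def d2_cubic(t):
    # Second derivative of 3t^2 - 2t^3.
    return 6 - 12 * t


def d2_quintic(t):
    # Second derivative of 6t^5 - 15t^4 + 10t^3.
    return 120 * t**3 - 180 * t**2 + 60 * t


for t in (0, 1):
    print(t, d2_cubic(t), d2_quintic(t))
# 0 6 0
# 1 -6 0
```

The cubic has a non-zero second derivative at both ends of the interval, so curvature jumps where two lattice cells meet, whereas the quintic goes to zero there. Both curves still pass through 0, 0.5 and 1 at t = 0, 0.5 and 1.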

Simplex Noise Simplex Noise is a noise type developed by Perlin as a way of standardizing noise implementations, and it is designed for hardware implementation (Perlin provided an example hardware implementation along with the algorithm) [6][7]. It uses the simplex of the given dimension to calculate gradients. In simplex noise we sum the contributions of each corner of the simplex to determine the noise value at a given point. The point is influenced only by the corners of the simplex it lies in. This means very few calculations are needed to determine the noise, especially in higher dimensions: the number of contributing corners grows as O(n) rather than O(2^n). A big problem with simplex noise is that it looks different from regular Perlin noise.

Illustration 6 : Simplex Noise
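The scaling argument can be made concrete by counting the corners that contribute to one noise sample: a lattice cell of classic Perlin noise has $2^n$ corners, while a simplex has $n + 1$. The helper names below are hypothetical, and the counts only cover the number of corners, not the full per-sample cost.

```python
def hypercube_corners(n):
    """Corners of an n-dimensional lattice cell (classic Perlin noise)."""
    return 2 ** n


def simplex_corners(n):
    """Corners of an n-dimensional simplex (simplex noise)."""
    return n + 1


for n in (2, 3, 4, 6):
    print(n, hypercube_corners(n), simplex_corners(n))
# 2 4 3
# 3 8 4
# 4 16 5
# 6 64 7
```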

mNoise mNoise is a noise implementation created by Marc Olano as a way of making noise calculations cheaper on graphics hardware [4]. There have been many attempts to implement efficient noise on GPUs [10], and mNoise is by far the most efficient [8]. The first problem with implementing Perlin noise on graphics hardware is that the permutation table cannot be passed to the shader in any way other than as a texture, so every table access becomes a texture lookup. Table lookups are usually fast on a CPU, but texture lookups can become a bottleneck on modern GPUs. mNoise instead uses a pseudo-random generator to generate the gradients. These gradients are then interpolated as in Improved Perlin Noise to generate the noise. The pseudo-random generator used is a Blum Blum Shub (BBS) generator [11]. It can calculate a pseudo-random number in only 5 instructions, although the pattern repeats every M lattice points, where M is the modulus used to calculate the random number. M is defined as the product of two factors, p and q. They should be primes, but I find that simply trying different values can give good results. It is important to remember that mNoise has artifacts because of the repetition of the BBS. As the illustrations below show, the artifacts are obvious. Note also that when M = 1 we see the underlying lattice grid, because the BBS then repeats at every lattice point.

Illustration 7 : M = 1
Illustration 8 : M = 61
Illustration 9 : M = 151
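The repetition behaviour of a Blum Blum Shub generator is easy to observe in isolation. The sketch below only implements the standard BBS recurrence $x_{k+1} = x_k^2 \bmod M$; how mNoise feeds lattice coordinates into the generator is not reproduced here, and the function names and the seed value 2 are assumptions for this example.

```python
def bbs_sequence(seed, m, count):
    """First `count` values of the recurrence x <- x*x mod m."""
    values = []
    x = seed % m
    for _ in range(count):
        x = (x * x) % m
        values.append(x)
    return values


def bbs_period(seed, m):
    """Length of the cycle the recurrence eventually falls into."""
    seen = {}
    x = seed % m
    step = 0
    while x not in seen:
        seen[x] = step
        x = (x * x) % m
        step += 1
    return step - seen[x]


print(bbs_sequence(2, 61, 5))  # [4, 16, 12, 22, 57]
print(bbs_period(2, 61))       # 4
print(bbs_sequence(2, 1, 3))   # [0, 0, 0]
```

With a modulus of 1 every value collapses to 0, so every lattice point receives the same gradient, which is consistent with the bare lattice grid visible for M = 1. Small moduli give short cycles, which is one way to think about the visible repetition in the other illustrations.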

Fractal Sums Fractal sums refer to taking several octaves of noise, multiplying each by a factor and summing them together [9]. The result is a Fractal Brownian Motion (fBm). Usually two variables are used to control the look of an fBm: lacunarity and gain. Gain is the factor the amplitude is multiplied by for every octave. Lacunarity is the factor the frequency is multiplied by for every octave, so changing lacunarity changes the frequency gap between successive bands of noise. Two types of fBm are so common that they have their own names. 1/f noise is an fBm where gain = 1/lacunarity. The other is turbulence, which is a regular fBm except that the absolute values of the octaves are added together. The difference between them is that turbulence is always positive, while fBm has about 50% of its values in the negative range. Seen as a texture, turbulence gives a more billowy feeling while 1/f noise is cloudier. Note that turbulence creates more high-frequency content and can therefore create more aliasing than a regular fBm. The illustrations below were created using mNoise as the noise algorithm.

Illustration 10 : 1/f fBm
Illustration 11 : Turbulence
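A fractal sum and its turbulence variant can be sketched as follows. To keep the example self-contained it uses a small 1D gradient noise rather than mNoise; that noise function, the seed and the default parameters (`lacunarity=2.0`, `gain=0.5`, i.e. a 1/f sum) are assumptions made for this illustration.

```python
import math
import random

# Assumed stand-in noise: 1D gradient noise with random slopes per lattice point.
_rng = random.Random(7)
_GRAD = [_rng.uniform(-1.0, 1.0) for _ in range(256)]


def noise1d(x):
    i = math.floor(x)
    f = x - i
    u = f * f * f * (f * (f * 6.0 - 15.0) + 10.0)  # quintic fade
    a = _GRAD[i & 255] * f                 # contribution of the left point
    b = _GRAD[(i + 1) & 255] * (f - 1.0)   # contribution of the right point
    return a + u * (b - a)


def fbm(x, octaves=4, lacunarity=2.0, gain=0.5):
    """Sum of octaves; gain scales amplitude, lacunarity scales frequency."""
    total, amplitude, frequency = 0.0, 1.0, 1.0
    for _ in range(octaves):
        total += amplitude * noise1d(x * frequency)
        amplitude *= gain
        frequency *= lacunarity
    return total


def turbulence(x, octaves=4, lacunarity=2.0, gain=0.5):
    """Same sum, but each octave contributes its absolute value."""
    total, amplitude, frequency = 0.0, 1.0, 1.0
    for _ in range(octaves):
        total += amplitude * abs(noise1d(x * frequency))
        amplitude *= gain
        frequency *= lacunarity
    return total


print(fbm(0.37), turbulence(0.37))
```

Because every octave is a full noise evaluation, the cost of either function grows linearly with `octaves`, which matters when the sum is evaluated per pixel.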

Aliasing Regular textures use mipmaps to remove aliasing, but a procedural texture is a lot harder to anti-alias [9]. With noise, however, it is easy to use a frequency clamping filter that removes the high frequencies of the noise texture when it is viewed from afar. Since noise has a known average value, an octave that is too fine to be resolved can be replaced by that average. The filter width that decides this is found by computing the ddx and ddy of the current screen coordinate and then adding the squared derivatives.
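One possible shape of such a frequency clamping filter is sketched below. It is an illustration under assumptions, not the filter from [9]: the derivatives that ddx and ddy would supply in a shader are passed in as plain numbers, the width is taken as the square root of the summed squares, the fade thresholds `fade_start` and `fade_end` are arbitrary, and the noise average is assumed to be 0.

```python
import math


def filter_width(du_dx, du_dy):
    """Pixel footprint estimated from the two screen-space derivatives."""
    return math.sqrt(du_dx * du_dx + du_dy * du_dy)


def octave_weight(frequency, width, fade_start=0.25, fade_end=0.5):
    """1 while an octave is resolvable, fading to 0 as it gets too fine."""
    footprint = frequency * width  # noise cycles covered by one pixel
    t = (footprint - fade_start) / (fade_end - fade_start)
    return 1.0 - min(max(t, 0.0), 1.0)


def clamped_fbm(noise, x, width, octaves=6, lacunarity=2.0, gain=0.5):
    """Fractal sum in which unresolvable octaves fade to the average (0)."""
    total, amplitude, frequency = 0.0, 1.0, 1.0
    for _ in range(octaves):
        weight = octave_weight(frequency, width)
        if weight > 0.0:
            total += weight * amplitude * noise(x * frequency)
        amplitude *= gain
        frequency *= lacunarity
    return total


# math.sin stands in for a noise function with an average of 0.
print(clamped_fbm(math.sin, 1.0, width=0.01))  # near: all octaves kept
print(clamped_fbm(math.sin, 1.0, width=10.0))  # far: everything fades to 0.0
```

A useful side effect is that octaves with zero weight can be skipped entirely, so distant pixels are also cheaper to evaluate.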

Implementation I tried to replace regular textures with procedural counterparts as a way of saving video memory and samplers, and to find any drawbacks of this method. My implementation consisted of a group of shader functions that calculate an implicit texture used in the terrain engine. I implemented everything using DirectX 9; in my conclusion I talk about DirectX 10 and what it could mean for procedural textures.

Implicit and Explicit texturing Since we wanted to lower video memory usage, we had to use an implicit form for the material texture. The normal map, on the other hand, could not be created implicitly, since that would have required calculating the values of every pixel around the current one. I therefore decided to create it explicitly, in a way that made sure the normal map always used the same height map as the diffuse texture. This approach had a few issues, the most dominant being the driver throwing infinite loop errors when the resolution of the explicit texture was too high. I could never find anyone else reporting this issue, nor a way to disable the check in the driver. I later moved the normal map generation into an application I called BumpAtlas Generator.

The noises Simplex noise in 2D has the same computational complexity as mNoise, but uses texture reads where mNoise uses arithmetic operations. This makes mNoise a bit faster than simplex noise, but one has to remember that artifacts are always a possibility with mNoise. Artifacts might be a problem in some situations, but I resolved them by carefully selecting the modulus M. I generally try both noises and see which gives the best result. In general I find that simplex noise gives the best plain noise and mNoise gives better fractal sums, but that is just a personal preference. Olano recommends always using 61, but I took a liking to 151. I implemented Improved Perlin Noise as well, but it needed more texture lookups and operations than simplex noise with no gain, so I chose never to use it.

The texture The texture in question was a simple 2D grid atlas of the different textures used in the terrain, which required a branch to select which texture to generate. Considering this and the number of operations needed to create procedural noise, the only usable profiles are ps_3_0 (for HLSL) and above. A dynamic branch is also very useful as a way of optimizing the shader, since most of the time only one of the branches is executed. Some of the textures in the grid are blends of other textures in the grid, so I split the textures into individual functions and blend between the functions. This opens up the possibility of controlling the blending with game data and not just fixed texture blends, but that is beyond the scope of this thesis.

Illustration 12 : The Material Atlas Texture

Compilers I had a few issues with the shader compilers when compiling large shaders. Most issues were with fxc for DX9: it had serious performance problems and sometimes failed to compile at all (it appeared to get stuck in infinite loops; this might have been extreme slowness instead, but several hours of compile time for a shader is unacceptable either way). The problems I encountered with cgc were mostly its inability to optimize the code as well as fxc does. I found that switching to the newest compiler I could find (the fxc shipped with DX10) resolved most of the speed issues.

Permutation table The permutation table was generated using a simple Python script that produced a table of n elements. I created a procedural function in FX Composer that filled a texture with the array. This texture was then saved to a file to be used in the engine tests. A shader cannot look up values in an array at any reasonable speed (the compiler turns the lookup into a long sequence of operations, close to 100 for a 64-element array), so texture lookups are the only choice.
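The original script is not reproduced in this thesis; a minimal equivalent could look like the sketch below. The function name `make_permutation_table`, the fixed seed and the choice of `random.Random.shuffle` are assumptions made for this example.

```python
import random


def make_permutation_table(n=256, seed=0):
    """Return the values 0..n-1, each exactly once, in pseudo-random order."""
    table = list(range(n))
    # A fixed seed keeps the table, and therefore the noise, reproducible.
    random.Random(seed).shuffle(table)
    return table


perm = make_permutation_table(256, seed=2007)
print(len(perm), min(perm), max(perm))  # 256 0 255
```

With n = 256 each entry fits in a single 8-bit texture channel, which suits storing the table as a texture for the shader to read.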

An issue that I discovered with permutation-table noises and dynamic branching was that reading from a texture inside a dynamic branch is not allowed when the texture coordinates have been modified, i.e. when they are read from a temporary register (r#). This made me use mNoise in all my shaders. I later went back to check this and found that the problem is the mipmap level determination in regular texture lookups. Using the lod lookup instructions instead and setting the mipmap level manually (in my case to 1, since the permutation table is not mipmapped and uses a point filter) solved the issue.

Decals I tried a few ways of drawing decals on top of the procedural textures to generate more detail. I used noise with different filters, placed in the alpha channel (similar to texture splatting techniques [12]), to position the decals. For it to look good the decals needed to be textured quads, so I generated the noise from integer gradients and passed it through a filter that kept only some of the cells (I used both high-pass and low-pass filters). I then rescaled the texture coordinates within each cell to unit size and used them as the lookup into the decal texture. I also used the value of the integer noise to determine which decal to use. This all worked as intended, but with the big problem that the underlying integer grid was far too obvious. The distribution of the decals was also far too random, although this can be fixed by finding an appropriate seed. Finally, the decals needed to be scaled and rotated to hide the more obvious repetitions, but I never found a way to do this while keeping the integer noise grid for the textured quads (without the grid, only parts of a decal would sometimes show, completely destroying the illusion of the decals being part of the surface).

Illustration 13 : Procedural Texture with decals
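The grid-based placement can be sketched as follows. This is a simplified illustration rather than the shader used for Illustration 13: an integer hash stands in for the integer noise, a single threshold (`keep_above`) stands in for the high-pass filter, and all names and constants are hypothetical.

```python
import math


def cell_hash(ix, iy, seed=0):
    """Pseudo-random value in [0, 1] that is constant within a grid cell."""
    h = (ix * 374761393 + iy * 668265263 + seed * 1442695041) & 0xFFFFFFFF
    h = ((h ^ (h >> 13)) * 1274126177) & 0xFFFFFFFF
    h ^= h >> 16
    return h / 0xFFFFFFFF


def decal_lookup(u, v, cells=8, keep_above=0.7, decal_count=4):
    """Return (decal index, local u, local v), or None where no decal is drawn."""
    x, y = u * cells, v * cells
    ix, iy = math.floor(x), math.floor(y)
    value = cell_hash(ix, iy)
    if value < keep_above:  # filter: keep only some of the cells
        return None
    # Rescale the coordinates inside the cell to unit size for the decal lookup.
    local_u, local_v = x - ix, y - iy
    # Reuse the cell value to choose which decal to show.
    index = int((value - keep_above) / (1.0 - keep_above) * decal_count)
    return min(index, decal_count - 1), local_u, local_v


print(decal_lookup(0.30, 0.55))
```

Every pixel of a cell sees the same `value`, so a cell shows either one complete decal or none. That is also why the integer grid remains visible in the result, as noted above.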

Fractal Sums I created fractal sum and turbulence versions of all noises. The biggest issue with these functions is their use of several octaves of noise, which quickly raises the number of operations in the shader. In the final textures I kept the number of fractal sums to a minimum, and where I used them I used only one or two octaves. Fractal sums also have a problem with continuity. The noises I use are gradient noises, meaning that they tile automatically. A fractal sum, on the other hand, does not repeat as well, so when the texture coordinate of a textured quad jumps from 1 back to 0 (at the edge of the quad) there is an obvious texture seam. This could be fixed by using input values that are always continuous in the world, for instance world position. I looked into making the fractal sums repeat seamlessly but failed to find a way that worked.

Creating a copy of a real texture I tried creating a copy of one of the textures I was to replace. I talked with the artist who created it, learned how he had made it, and tried to mimic his process where possible. After a week of work I was nowhere close to the quality of his texture, but the result did at least resemble the same surface. That single surface calculation took around 800 operations, which is barely real time on a ps_3_0 card even when rendering just that surface and none of the other things normally done in the shader.

Aliasing I implemented frequency clamping on my regular noise functions as Tatarchuk described in her presentation at GDC '07 [9]. The filter width was calculated using the ddx and ddy instructions in HLSL, similarly to how Tatarchuk described it. I felt it worked really well, and it was quite easy to do.

Illustration 14 : Procedural texture without filtering
Illustration 15 : Procedural texture with filtering

Blending The current terrain shader blends several textures into the final result, meaning that the procedural texture is evaluated several times each frame. This was a serious performance limitation: one or two octaves of noise were all that could be used in a single material. However, the blending also turned out to remove a lot of the detail in most of the textures, so a material with a fairly limited amount of detail looked a lot better than I initially expected.

Explicit Normal Map I created separate functions for determining the shading and the height of a texture, which makes the creation of normal maps easier. Creating a normal map is not feasible in an implicit shader at real-time speeds, so I chose to generate it as an explicit texture; it was still created on the graphics card, though. I had issues with the long draw time of the texture (the driver reported stalling) and found no way of circumventing this, which limited the resolution that could be generated explicitly. For my shaders the limit was 2048 * 2048. Since the normal map in no way depended on game data, it was possible to move the creation of the texture to a separate application (called BumpAtlas Generator), which I did because the render time for a sufficiently large texture was too long (almost a minute).

BumpAtlas Generator was written using Direct3D 9. It creates a render target of a specified size (the default is 2048 * 2048) and outputs the result as a DDS file encoded in ARGB. It also renders the mipmaps, using lower resolution render targets and the same shaders.
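To give a feeling for the sizes involved, the sketch below lists the mip levels of a 2048 * 2048 render target and the memory they would occupy. It assumes an uncompressed ARGB format with 8 bits per channel (4 bytes per texel) and a full mip chain down to 1 * 1; neither detail is stated above, and the function names are hypothetical.

```python
def mip_chain(size):
    """Edge lengths of a square texture's mip levels, from `size` down to 1."""
    levels = []
    while size >= 1:
        levels.append(size)
        size //= 2
    return levels


def argb8_bytes(size):
    """Bytes for one square level at 4 bytes per texel (assumed format)."""
    return size * size * 4


levels = mip_chain(2048)
print(len(levels))                          # 12
print(argb8_bytes(2048))                    # 16777216 (16 MiB for the base level)
print(sum(argb8_bytes(s) for s in levels))  # 22369620 (about 21.3 MiB in total)
```

Under these assumptions the mip chain adds roughly one third on top of the base level, and each lower level needs a quarter of the shader evaluations of the level above it.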

Results Here are some screenshots of the different versions of the shaders I created, together with the original shader for comparison. With the low resolution height map (i.e. the height map saved as the alpha component of the normal map) I removed the entire dependency on the MaterialAtlas texture (see Illustration 12), which was about 20 MB.

Illustration 16: Original Screenshot

Illustration 17: Procedural texture with sampled low resolution height map

Illustration 18: Procedural texture with sampled high resolution height map

Discussion

Conclusion

Shaders Shaders are still a fairly new GPU feature and thus have some issues. The compilers were still quite bad and had a lot of performance issues when compiling large shaders. I also had a hard time finding information on my issue with texture reads from temporary registers in dynamic branches.

I found some limits on what can be done with shaders when texturing implicitly. The most noticeable is the limit on what information you can access: when you cannot get information about the pixels around you, blurs and similar effects become highly inefficient. Note that this refers to single-pass shaders. It could be solved by multi-pass rendering, but I did not feel that to be worth the cost. The limits on reading from arrays and on the amount of data that can be passed to a shader ruled out techniques like cellular texturing and noise implementations that do not use textures.

With newer graphics cards gaining more and more processing power while texture performance grows less rapidly, procedural textures become an increasingly interesting way to get better performance. I noticed fairly bad performance on a graphics card of the previous generation (Nvidia 6600), so this is more interesting as a performance gain on newer cards.

Tools

A big issue with procedural textures is the lack of proper tools with which to build them. I spent a lot of time in FX Composer building the shaders needed. This worked out okay, but it is not the most stable application, and there is also quite a difference between seeing a texture on a mountainside and seeing it on a flat quad. A proper tool would support both implicit and explicit texturing (by explicit I mean an option to save the generated textures). I would also like a tool with a workflow more similar to a professional image tool (like Photoshop) that could save its images as procedural shaders. This feels like a must for an application that uses procedural textures to a large degree.

Higher dimension noise

I have found that focusing on replacing 2D textures with 2D procedural textures makes it fairly hard to get good-looking results. I think that working with higher-dimensional noise, mostly 3D noise, to generate a texture dependent on world position would give a better result. The problem with this lies in the loss of control over the result. This could mean building the entire terrain using a 3D noise algorithm and then building the texture from the same data that is used to create the vertex data.

Drawbacks

Implicit textures are, operation-wise, a lot more expensive than regular texturing. With regular texturing, a low resolution means a pixelated texture; this may not be ideal, but it can be acceptable in certain situations. With implicit textures the cost instead shows up as a fill rate issue, which results in a low frame rate, in general something you absolutely do not want. This implies a limited resolution for a given hardware. The fact that many applications today are already fill rate limited does not make this better.

As I mentioned when discussing tools, another big problem in procedural texturing is content creation. Because of the close relation to mathematics and the inability to make changes to detailed parts of the texture, it is very hard to get a texture with the same detail as a regular texture. I solved this by combining the two types of textures, which worked well for me. However, since I sample a height texture in my procedural texture, I lose the resolution and size independence. This could be addressed by using a high resolution height map, but that does not eliminate the problem; it merely hides it.

Other advantages of implicit procedural textures

Procedural textures are naturally built using a set of variables that control the texture. In my case an obvious example would be a snow coverage variable, which I currently calculate from the height map. In the current texture a blend between four textures is used to create snow coverage; this could be replaced by simply changing this variable and calculating the blend when evaluating the procedural texture. So in a more general sense, implicit textures can be changed by in-game variables much more smoothly than a regular texture can.
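As an illustration of driving a procedural texture from a single variable, here is a minimal sketch of a snow blend. It is not the shader used in this work: it assumes only two colors (rock and snow) instead of four textures, a smoothstep transition around a hypothetical `snow_line` parameter, and heights normalized to the range 0 to 1.

```python
def smoothstep(edge0, edge1, x):
    """Hermite interpolation from 0 to 1 as x goes from edge0 to edge1."""
    t = max(0.0, min(1.0, (x - edge0) / (edge1 - edge0)))
    return t * t * (3.0 - 2.0 * t)


def lerp(a, b, t):
    """Linear blend between two RGB tuples."""
    return tuple(ca + (cb - ca) * t for ca, cb in zip(a, b))


def shade(rock_color, snow_color, height, snow_line, blend_width=0.1):
    """Blend rock into snow around snow_line (a hypothetical game variable)."""
    coverage = smoothstep(snow_line - blend_width,
                          snow_line + blend_width, height)
    return lerp(rock_color, snow_color, coverage)


if __name__ == "__main__":
    rock, snow = (0.4, 0.4, 0.4), (1.0, 1.0, 1.0)
    # Lowering snow_line at run time makes the snow creep down the mountain.
    for line in (0.8, 0.6, 0.4):
        print(line, shade(rock, snow, 0.55, line))
```

Changing `snow_line` between frames changes the result continuously, which is the kind of smooth in-game control described above.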

How to create the 2D textures we want

One can conclude that procedural texturing is a valid option for replacing a limited class of textures, but the question then becomes: how could this be done? I have found three ways of creating 2D textures procedurally to replace an existing texture.

The first way is the most straightforward and obvious solution: to generate the texture using only procedural functions, like noise and methods derived from noise (like fractal sums). The problem with this method is creating the details you want while still keeping the shader real time, i.e. keeping the operations to a minimum. I also found this to have many limitations in what you can create, since we are limited by what we can do in a shader (for instance, we cannot blur or use techniques like cellular texturing). But I do feel this is a valid option for really simple textures.

The second way I tried used a procedural function to generate the large areas of color and structure in the texture, and then a tiled 2D texture on top of it to add details. The big issue here was of course the use of a second texture, since we wanted to lower the number of samplers used and not just the amount of video memory. Tiling a low resolution texture (low resolution since we wanted to save memory) produced obvious repetition in most cases. I found that this was generally the worst way of generating a good texture.

The third and last technique I tried created the general color and structure of a texture as in the second technique, but this structure was then exported to a high resolution height map texture (I used 2048*2048 because of technical limitations; I would have wanted an even higher resolution). An artist would then edit this texture to add the fine details, similar to how details are added to the height map when creating a regular texture. This height map would then be used to generate a normal map, with the height saved in the normal map's alpha channel. The height map would be blended into the previously generated procedural diffuse map. The normal could also be calculated in the shader to generate a highly detailed normal at close distances if one wants. Illustration 17 shows the results with a low resolution height map. The obvious pixels are due to the low resolution height map; less obvious, but more of an issue, is the lowered frame rate. When taking that picture I had an estimated frame rate of about 5 frames per second, which is very bad performance, though remember that this was on an Nvidia 6600. The reason for the low performance is simply the number of operations used.

Note that none of these techniques try to remove the generated normal map. This is because generating a normal map procedurally is in most cases impossible in real time. A procedural height map simply requires too many operations, considering that you would need to evaluate it four times (in my case five times, since I save the height in the normal map's alpha).
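The sample count can be made concrete with a small sketch. It assumes the normal is derived with central differences over the four direct neighbors, which is one common choice and not necessarily the exact filter used in this work; the `strength` parameter and the clamp-to-edge addressing are also assumptions.

```python
import math


def sample(height_map, x, y):
    """Read a height with clamp-to-edge addressing."""
    rows, cols = len(height_map), len(height_map[0])
    x = max(0, min(cols - 1, x))
    y = max(0, min(rows - 1, y))
    return height_map[y][x]


def normal_and_height(height_map, x, y, strength=1.0):
    """Return (nx, ny, nz, height) for one texel of the normal map."""
    # Four neighbor samples give the slope in x and y.
    left = sample(height_map, x - 1, y)
    right = sample(height_map, x + 1, y)
    up = sample(height_map, x, y - 1)
    down = sample(height_map, x, y + 1)
    # Fifth sample: the height itself, stored in the alpha channel.
    center = sample(height_map, x, y)

    nx = (left - right) * strength
    ny = (up - down) * strength
    nz = 1.0
    length = math.sqrt(nx * nx + ny * ny + nz * nz)
    return (nx / length, ny / length, nz / length, center)


if __name__ == "__main__":
    ramp = [[0.0, 1.0, 2.0] for _ in range(3)]  # height rises along x
    print(normal_and_height(ramp, 1, 1))
```

If `sample` were a procedural height function instead of a texture read, every output pixel would pay for five full evaluations of that function.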

A few techniques I considered but ultimately dropped were mostly different types of multi-pass solutions for generating the textures. By going multi-pass, a procedural texture could use more advanced techniques (like calculating a normal map procedurally), but the increase in draw calls and draw time made me feel that this was not worth it.

Further work

I would like to try building the terrain rendering and texturing with procedural textures in mind, say using 3D noise to generate height maps for the terrain and then perhaps building a texture from that as well. That is, not just replacing 2D textures with procedural representations, but building the vertex data and the textures from the same data.

Writing these shaders in DirectX 10 would mean better performance than what I have now. Apart from the obvious wins that come from a faster GPU, its feature of reading back output data without CPU interference should give me more techniques to use in the shaders. For instance, the normal map could be created as a two-pass render. Also, DirectX 10's resource views and integer calculations on the inputs could mean a great win for any gradient noise, because the permutation table is integer only.
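To show why integer support matters for gradient noise, the sketch below hashes lattice coordinates through a 256-entry permutation table using only integer operations, in the style of Perlin's noise. The table here is a seeded random shuffle chosen for illustration, not Perlin's published table, and the function names are hypothetical.

```python
import random


def make_permutation(seed=0):
    """Build a shuffled 0..255 table, doubled to avoid index wrapping."""
    rng = random.Random(seed)
    table = list(range(256))
    rng.shuffle(table)
    # Doubling lets perm[perm[x] + y] stay in range without a second mask.
    return table + table


def lattice_hash(perm, ix, iy):
    """Map integer lattice coordinates to a pseudo-random value in 0..255."""
    return perm[perm[ix & 255] + (iy & 255)]


if __name__ == "__main__":
    perm = make_permutation(seed=42)
    print([lattice_hash(perm, x, 0) for x in range(8)])
```

On hardware without integer instructions the same lookup has to be emulated with floating point math and texture reads, so native integer indexing removes work from every noise evaluation.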

References

[1] Texturing and Modeling - A Procedural Approach, 3rd edition: David S. Ebert, F. Kenton Musgrave, Darwyn Peachey, Ken Perlin, Steven Worley; (2002); ISBN: 1–55860–848–6; Morgan Kaufmann Publishers

[2] Lefebvre S., Neyret F.: Procedural Pattern Based Texturing: Proceedings of the 2003 symposium on Interactive 3D graphics (2003); pp 203 – 212

[3] Perlin K.: An image synthesizer: Proceedings of the 12th annual conference on Computer graphics and interactive techniques (1985); pp 287 – 296

[4] Olano M.: Modified Noise for Evaluation on Graphics Hardware: Proceedings of the ACM SIGGRAPH/EUROGRAPHICS conference on Graphics hardware (2005); pp 105 – 110

[5] Perlin K.: Improving Noise: Proceedings of the 29th annual conference on Computer graphics and interactive techniques (2002); pp 681 – 682

[6] Perlin K.: Noise Hardware: Real-Time Shading SIGGRAPH Course Notes (2001), Olano M. (Ed.)

[7] Gustavsson S.: Simplex Noise Demystified: http://staffwww.itn.liu.se/~stegu/simplexnoise/simplexnoise.pdf (Last viewed: 2007-05-18)

[8] Fuller, A. R., Krishnan, H., Mahrous, K., Hamann, B., and Joy, K. I.: Real-time procedural volumetric fire: Proceedings of the 2007 Symposium on interactive 3D Graphics and Games (2007); pp 175-180

[9] Tatarchuk N.: The Importance of Being Noise: Fast, High Quality Noise: Game Developer Conference 2007: http://developer.amd.com/assets/Tatarchuk-Noise(GDC07-D3D_Day).pdf (Last viewed: 2007-05-18)

[10] Green S.: Implementing Improved Perlin Noise: In: GPU Gems 2: Pharr M. (Ed.); (2005); ISBN: 978-0-321-33559-3; Addison-Wesley Professional; pp 409-416

[11] Blum L., Blum M. and Shub M.: A Simple Unpredictable Pseudo-Random Number Generator; SIAM Journal on Computing (1986); pp 364 – 383

[12] Bloom C.: Terrain Texture Compositing by Blending in the Frame-Buffer: http://www.cbloom.com/3d/techdocs/splatting.txt (Last Viewed: 2007-05-25)

[13] ATI: Radeon X1800 Shader Architecture: http://ati.amd.com/products/radeonx1k/whitepapers/X1800_Shader_Architecture_Whitepaper.pdf (Last Viewed 2007-05-26)